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Chapter 1 


Introduction 


Ask anyone to define an operating system. I will bet that for every ques- 
lion you ask, you will receive a different answer. You may ask, “Is the 
definition of an operating system so vague?” I’m happy to answer — no. 
However, operating systems do mean different things to different people. 

To many computer users, it is the full collection of utilities and the 
kernel that drives their computer to run their favorite applications. To 
engineers working with small microprocessor systems, it may be the 
kernel and drivers together that makes a complex control system work. 
No matter what the definition, an operating system of one form or 
another is an integral part of software engineers’ and users’ lives. 

Also, operating systems are the subject of heated debates and argu- 
ment. The debates range from the OS/2 versus Windows battles to the 
Berkeley versus System V UNIX debates. Lately, with the introduction 
of WindowsNT, the WindowsNT versus UNIX debate dominates dis- 
cussions in many a workplace. These discussions typically result in 
heated debates that approach religious fervor over the choice of an 
operating system. 
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A Brief History of Desktop Operating Systems 


Historically in the minicomputer and mainframe world, there were as 
many operating systems as there were computer vendors. Each vendor 
would design a unique operating system that took advantage of its 
unique hardware. Sometimes, the operating system changed drastically 
from one revision to another as the hardware changed, requiring major 
rewrites of application code. Fortunately, this was not the case for the 
desktop computing environment. 

With the introduction of microprocessors, many individuals saw an 
Opportunity to create small desktop computers. An 8-bit microproces- 
sor, such as the Intel 8080 or Motorola 6800, was the central processing 
unit for the early desktop computers. Memory limitations were on the 
order of a few kilobytes, and some rather amazing applications ran in 
this small space. As the popularity of these computers grew, so did a 
small cottage industry serving hobbyists and small businesses. 

Early users of these computers recognized advantages in shanng 
technology, and some early computer conferences attempted to stan- 
dardize media formats and storage methods so programs be could 
exchanged. The storage medium for early desktop computers was audio 
cassette, with recording formats such as Tarbell and Kansas City Stan- 
dard. Later, 8-inch and 51/4-inch floppy disks became affordable, and 
small operating systems such as CP/M and Flex further promoted stan- 
dardization while still allowing room for experimentation. 

The market for the desktop computer grew rapidly as did the com- 
plexity of microprocessors. Processors such as the Intel 8086 and the 
Motorola 68000 began promoting 16-bit and 32-bit capabilities, and 
companies such as Apple and IBM took advantage of these new proces- 
sors. Operating systems developed for these new systems spawned a 
new generation of applications. These companies also spent consider- 
ably more time and money in planning markets than the earlier cottage 
industry that sprung up around desktop computers. Of particular inter- 
est to us is a case study of the joint IBM and Microsoft approach with 
MS-DOS. Its evolution will help shed some light on features and func- 
tions within MS-DOS. 
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MS-DOS 


IBM’s entry into the desktop market was with the IBM-PC. This desk- 
top computer was the product of a skunk-works team within IBM that 
featured an Intel 8088 microprocessor as its CPU and 64Kb of RAM. 
Originally introduced with only cassette tape as standard storage, the 
IBM-PC also offered a 51/4-inch floppy disk as an option. To manage 
the diskette, IBM licensed the MS-DOS operating system from 
Microsoft. 

The original design of MS-DOS evolved from an operating system 
developed by a Seattle company for its S-100-based desktop comput- 
ers. Microsoft purchased all rights to it, because operating systems 
were new to them and IBM had approached them for software to run on 
its new PC. This system became the source code base for MS-DOS. 

The MS-DOS precursor was similar to the CP/M operating system 
for the Intel 8080 and required little RAM, a distinct advantage because 
the 80x86 family is similar in design to the 8080. By design, the operat- 
ing system provided a similar API (Application Programming Inter- 
face) to CP/M. As a result, MS-DOS 1.0 held very closely to the CP/M 
model. When combined with a familiar architecture, many developers 
were able to adapt their CP/M-based applications to MS-DOS. Favor 
quickly shifted toward Microsoft to the detriment of Digital Research, 
whose CP/M-86 operating system was the primary competition. 

As the market matured, so did both MS-DOS and the PC. Demands 
for MS-DOS began to grow from outside IBM, with IBM leading the 
pack in technology. The next generation of desktop computers from 
IBM was the XT. This computer was similar to the PC but featured 
more memory and a 10Mb internal disk drive. Its motherboard also was 
capable of much more memory: 256Kb instead of 64Kb. IBM’s target 
for this more powerful system was the business market. Previously, 
only large businesses could afford any kind of computing power. The 
XT addressed the needs of smaller businesses. 
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In order to address this new market, Microsoft developed a new 
strategy for their products as well. They adopted a dual approach to 
desktop computing environments. To manage the new internal disk 
drive, Microsoft adapted its XENIX operating system (an AT&T UNIX v7 
derivative) for the XT. They also enhanced the MS-DOS operating system 
to support larger storage devices and added a XENIX-like hierarchical file 
system. Microsoft developers enhanced the MS-DOS API with many 
XENIX-like calls, which enabled easy migration of applications devel- 
oped for a personal computer market based on MS-DOS to a business 
market based on XENIX. Microsoft also enhanced the MS-DOS com- 
mand line interpreter, command.com, with I/O redirection and pipes 
similar to XENIX. This redesigned MS-DOS now became the basis for 
all new MS-DOS versions. 

The lock-step product announcements continued with both IBM and 
Microsoft again adding new products in unison. IBM introduced the 
AT, based on an Intel 80286, and Microsoft released another version of 
MS-DOS to support it. At this time, Microsoft began to abandon its earlier 
XENIX product in favor of LAN connectivity and GUI interfaces. IBM 
also started a joint development with Microsoft for OS/2. This new operat- 
ing system would take advantage of memory management and other 
advanced features that the new processor offered. The need for a new 
operating system was clear — MS-DOS lacked protection mechanisms, 
and it was very easy for a program crash to destroy the operating system. 
The Ctrl-Alt-Delete key sequence became very familiar to users, and a 
huge market for fixed-disk reconstruction programs was bom. 

For migration of MS-DOS applications to OS/2, IBM and Microsoft 
developed a strategy similar to their market entry strategy. They would 
provide all the functions that were previously available so that developers 
could quickly adapt existing applications to the new operating system. 
Virtual devices and API entries similar in functionality to BIOS calls elim- 
inated the need for int xx-style real-mode device control, while retaining 
the functionality familiar to developers. With API calls defined in C, OS/2 
encouraged greater developer productivity through the use of high-level 
languages and broke the traditional register call model that both CP/M and 
MS-DOS sported. Unfortunately, a design limitation of the 80286 forced 
some serious incompatibilities between OS/2 and MS-DOS. 
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The Intel 80286 featured two modes of operation, a real mode and a 
protected mode. The 80286 protected mode sported a new memory 
management scheme that modified the processor behavior so that mem- 
ory access is very different from real mode. Any access to memory uses 
the contents of descriptors, in place of segment registers, to look up a 
new base address and add it to the offset address. The descriptor plus 
offset scheme expanded memory capacity from | Mb in the 8086/8088 
to 16Mb in the 80286. An unfortunate side effect of this scheme is that 
real-mode programs using segment plus offset addressing will not port 
easily to protected mode. 

Additionally, a significant design oversight on Intel’s part existed in 
the 80286. The processor could switch from real mode to protected 
mode in order to take advantage of features offered by the new operat- 
ing system, but it could not switch back. Once in protected mode, the 
only way back was a hardware reset to the 80286. IBM designed new 
features into both the BIOS and hardware of the AT to make the switch 
back to real mode, but the context switch time was very high. Thus, 
new multitasking OS/2 applications worked in protected mode only, 
and MS-DOS real-mode applications were limited to a single task. 

The OS/2 protected-mode environment was similar to environments 
found on larger computers but was incompatible with existing MS-DOS 
applications. This limited its popularity — users did not want to drop 
all their old applications and spend much more for special protected- 
mode applications — and generated bad press about OS/2. When com- 
bined with rumors of discord between the successful IBM/Microsoft 
team, users stayed with MS-DOS. Microsoft decided at this point to 
create a migration path that would always include MS-DOS compati- 
bility for all its new operating systems. 

New directions from Microsoft included a new GUI environment, 
Windows, that layered MS-DOS. For all subsequent releases after 3.3, 
MS-DOS and Windows became more tightly coupled so that both could 
cooperatively take advantage of 80386+ processors. These new proces- 
sors featured improved memory management, linear addressing, and a 
V86 mode. The V86 mode allowed real-mode programs to function 
while taking advantage of operating system services for process protec- 
tion and multitasking. Through it all, MS-DOS remained the foundation 
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for 80x86 desktop computing. This may change in the future, now that 
early machines are being retired from service, and Microsoft, feeling 
relief from supporting the older real-mode only computers, has many 
new, advanced operating systems in the wings. 


DOS-C 


DOS-C started as an experiment in writing device drivers using C for 
MS-DOS v3.1. Many articles and books that discussed the foundation 
of a DOS device driver had appeared on the scene; however, it seemed 
that each article described code written in assembly language only. In a 
discussion with some colleagues, I argued that this was unnecessary, 
citing UNIX as an example. As a result of these discussions, I began to 
write DOS device drivers in C. I developed both block and character 
device drivers, along with special C data structures to match the DOS 
request packet. This effort proved to be very successful, and I decided 
to continue with personal research to expand on these techniques. The 
goal: to create a full operating system by using the same techniques 
used to create the device drivers. 

I began the design of a new architecture that would use assembly 
language only at the call interface and for process management, such as 
stack manipulation and context switching. The implementation of this 
Operating system would take advantage of the C language features and 
require fewer resources to develop than the traditional assembly lan- 
guage designs. Although UNIX proved this concept for large multi-user 
operating systems, there were no documented instances for a traditional 
PC operating system. 

Building on this knowledge, I developed a minimal operating sys- 
tem using the device drivers written earlier, along with a new 8086 
interrupt API. Known as XDOS, it proved to be a functional operating 
system. This new operating system became the vehicle to explore boot- 
ing techniques, and I developed a C library SDK (Software Develop- 
ment Kit) for it. 

I later began to enhance XDOS and chose DOS as the new API. 
These enhancements included a more advanced architecture that uses 
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an IPL (Intermediate Program Loader) to set up the operating environ- 
ment before loading the operating system itself. Additionally, reentrant 
system calls for real-time applications became part of the design. Using 
the name NSS-DOS (bearing the initials of the consulting firm I was 
part of in place of Microsoft’s MS), I demonstrated this version to a few 
friends and business acquaintances. As a result of these demonstrations, 
a major defense contractor approached my consulting firm for a source 
license for this operating system. The only new requirement — it had to 
run on 680x0 processors. 

This presented a new challenge. Because of the MS-DOS model 
used for the API, NSS-DOS relied heavily on a segmented architecture. 
To meet this challenge, I began a major redesign of NSS-DOS. Devel- 
opment of new portability techniques allowed compilation of the same 
source code base on a variety of hosts and with a wide range of compil- 
ers and target processors. The new version, DOS/NT, resulted from this 
work. The redesigned kermel used microkermel techniques with logical 
separation of the file system, memory, and task managers. The new 
design included a portable DOS API along with a new DOS SDK to 
guarantee portability. By removing all processor-unique code from the 
core functions, a highly portable operating system resulted. 

The version described in this book, DOS-C, is derived from DOS/NT 
and is closer in design to MS-DOS: the kermel is one large program as 
opposed to many small programs. It is a nonmultitasking operating sys- 
tem that provides a large number of system calls similar to MS-DOS. 
This simplified design provides an operating system that readers can 
study without getting lost in the complexity that a microkemel entails. 
It also provides an excellent source code base to experiment on or with. 


Why Roll Your Own 


Although I discussed the reasons for the development of DOS-C, there 
are probably many readers who are wondering why bother at all. After 
all, a professional software company with a great deal of resources 
behind it wrote and maintains MS-DOS. Why would anyone develop a 
simple clone of a successful operating system? 
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Educational Tool 


The first application for such an operating system is as an educational tool. 
Without a collection of reverse engineering tools, it is impossible to exam- 
ine MS-DOS in any great detail. Anyone who has ever looked at mainte- 
nance of a software system will tell you that trying to get into the original 
programmer’s head to figure out the reasons for a particular design, without 
proper documentation, is a monumental task — especially at the disassem- 
bled assembly code level. My compliments to all the folks who are pounng 
out reams of “undocumented documentation.” Reverse engineering 
MS-DOS may also break any existing licensing agreements between you 
and Microsoft, so check your license before applying a disassembler to it. 

My guess is that MS-DOS source code is almost entirely in assem- 
bly language, and the kernel itself is guaranteed to be assembly lan- 
guage. Even if you did see the source, it would be more difficult to 
follow than if coded in a high-level language. My goal of minimizing 
assembly language yields two benefits: the code is portable and easier 
to understand. Although a high-level language in itself does not neces- 
sarily guarantee portability, it is easier to port C source code than 
assembly source code. Also, conditional code and looping constructs 
are easier to understand in C than assembly language. 

As an educational tool, you can replace any section of the tested oper- 
ating system. This allows you to experiment with different algorithms. 
You may want to change the LRU (Least Recently Used) buffer algo- 
nthm for a different one and compare performance. You may also want to 
learn about the different algorithms in order to apply a similar technique 
to a different application. For example, loadable device drivers may be 
useful in a stand-alone application where the host uses different devices 
in different configurations, or the buffered keyboard input code may pro- 
vide enough insight for an interactive display system. 

Finally, you can test the code on a host MS-DOS or UNIX machine 
using a source-level debugger because the majority of DOS-C is C 
code. For example, performing DOS/NT code tests on a 680x0-based 
UNIX machine proved invaluable. Many of the “big-endian” versus 
““little-endian” issues surfaced quickly during the unit testing of indi- 
vidual components of the operating system. 
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Embedded Systems 


Possibly one of the most useful applications is as an embedded operat- 
ing system. Many times, an application requires the functionality of an 
operating system, such as file storage, embedded databases, or sophisti- 
cated device control. A general-purpose operating system used in an 
embedded system may not do the trick. Additionally, it may not be pos- 
sible to develop the application on the host because of a lack of hard- 
ware or support tools. 

You can simplify the design of a typical embedded system by using 
a DOS PC for application development and later linking to your DOS 
to create an application ROM. You can debug the application on the 
host PC and define all the DOS entry points used. Later, you can either 
replace the DOS functionality or use an embedded operating system 
with custom device drivers for your unique hardware. The embedded 
operating system approach eliminates work and becomes a tool for the 
developer. Additionally, the embedded operating system becomes an 
abstraction layer between the application and the embedded system 
hardware. 


Non-Intel Applications 


When the operating system design is portable, new opportunities open 
up for both traditional and embedded applications. One possibility in 
the area of embedded systems is to design your system and develop it 
as you would for a normal PC embedded system, then use cross-devel- 
opment tools to recompile the application. When combined with the 
cross-compiled operating system, software verification reduces to the 
level of regression testing on the new hardware. 

Another reason for developing your own operating system is simply 
to adapt it to a new platform. There are many non-Intel processor sin- 
gle-board computers. To approach a company such as Microsoft to 
adapt MS-DOS to this custom hardware would be both time and cost 
prohibitive. However, with an operating system designed to be porta- 
ble, moving the operating system to a new target is a question of cross- 
development tools. 


10 — The FreeDOS Kernel 


DOS Clone 


Another way to view DOS-C is as a declaration of independence. As 
the desktop computer market matures, Microsoft is moving away from 
MS-DOS as the foundation of its GUI products. Another example of 
the trend away from DOS is the decreasing size of the DOS clone mar- 
ket. Products such as DR-DOS are rapidly disappearing from the scene, 
which leaves older applications in an orphan state. Creating your own 
version of DOS solves this problem. 


Extending DOS 


Have you, as a developer, ever looked at the DOS API and felt that if it 
had just one additional feature, your professional life would be easier? 
Taking the challenge head-on and creating your own version of DOS is 
an excellent way of extending DOS. 

A typical way to extend DOS is with the Novell approach. When they cre- 
ated their network product, it was as an extension to the existing MS-DOS 
int 21hcall. They captured the call and searched their own system calls. If 
it matched one, the network product would service it. Otherwise, the proces- 
sor was restored to its original state and the call was passed through to DOS. 
Microsoft also adopted this interrupt chaining with its standard int 2fh 
multiplex interrupt chain in v3 and above. 

With your own DOS, you can extend system calls to cover your own 
special call by simply adding it to the function call handler. You wnite 
an extension, test it, and integrate it into your DOS. The simplicity 
behind this type of customization gives you, the developer, enormous 
power over your applications environment. 
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Development Environment 


I expect to hear you now say: “Gee, this all sounds great, but what 
expensive tools do I need to do this?” As it turns out, you do not neces- 
sarily need any. By using a native 80x86 platform to develop for and 
test on, popular PC compilers from companies such as Borland and 
Microsoft are suitable as long as you follow special guidelines. Also, 
there is no need for special linkers if you design your architecture cor- 
rectly. As I will describe in later chapters, the use of an IPL simplifies 
the files on disk, and youcan use common .comand .exe file formats. 
Testing also becomes simpler through the use of some low-cost tools. 
With all this in mind, choose your tools. 


Compiler 


First, choose a C compiler. The version of DOS presented in this book 
compiles with Borland C (Borland International, Scotts Valley, Calif.), 
Microsoft C (Microsoft Corp., Redmond, Wash.) and other C cross- 
compilers. Special attention to the use of portable techniques pays off 
when using simple development tools. As it turns out, the only area that 
ties a given C compiler to its host operating system is the C runtime 
library. Avoid its use to ensure that your code will run stand-alone with 
the low-cost compilers. 

Beware of floating point code used by the C compiler. Some compil- 
ers generate non-reentrant floating point code or code that uses non- 
reentrant library calls. Many times, a compiler’s start-up code will take 
over a number of interrupt vectors. These vectors capture floating point 
exceptions generated when a call to a nonexistent coprocessor occurs. 
This is a common technique for automatic switching between copro- 
cessor floating point calculations and software emulation of the copro- 
cessor. For all these reasons, and more, avoidance of floating point code 
helps. DOS-C avoids all floating point code by avoiding the use of C 
library calls such as printf (), further simplifying the choice. 
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Assembler 


Besides the C compiler, you will need an assembler. The need for this is 
twofold. First, the standard start-up code delivered with a native C com- 
piler assumes a host machine and operating system. You will be work- 
ing with a “naked” machine that does not have an operating system 
available (remember, you are creating it). As a result, you need to sub- 
stitute your own start-up code that will take care of memory initializa- 
tion, stack and register initializations, etc., and finally call main(). 
Second, you will use the de facto DOS standard. This means that you 
use the same registers for system calls, switch stacks during system 
calls, etc. 

The support code must be written in assembly language, allowing us 
full control of the target machine. The asm shortcut provided by many 
C compilers simply does not suffice because we must move away from 
the C model for this support. You cannot afford stack register and frame 
pointer manipulations, and many times you will not generate code that 
resembles a C function. 


Linker 


Look at the linker provided with the C compiler you choose. For our 
80x86 version, you can use the standard DOS linkers. However, for 
many platforms they may not be suitable. You may need some form of 
“strip” utility as typically supplied on many UNIX platforms so that 
you can adequately create an IPL. As you will see later, our architecture 
specifies that the kernel is an executable file for a known format. Our 
choice of the .exe file type simplified our decision, but you will need 
to be careful when examining other platforms and targets. 
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Locator 


If you have decided to place the operating system into ROM, or if you 
have a special target machine architecture that requires you to fix the 
operating system in memory, you may also need a locator. This type of 
utility comes in various shapes and sizes and may be split between the 
linker and various other utilities, but it always serves the same purpose: to 
fix all memory references to absolute addresses and possibly to generate 
a special file format acceptable to various device programmers. There are 
two examples of this type of utility set. The first example is as a stand- 
alone locator. Multiple vendors in the 80x86 community have such utili- 
ties toconvert DOS .exe files into “hex” files. UNIX users may find that 
their linker, 1d, accepts special specification files and that other utilities 
will convert the resulting file into a hex file. Check your manuals because 
this is a function of the vendor whose UNIX you are running. 


Debugger 


With all your development tools in place, you will need to begin think- 
ing about debugging tools. This choice is dictated by a number of fac- 
tors. The most significant is your target architecture. If you plan to use 
your operating system in ROM, then a tool such as a ROM emulator 
that will accept a download from your development platform and emu- 
late your target ROM(s) may be a good choice. This device will permit 
you to download your code directly, but you will need some other tool 
in combination with the ROM emulator to assist you in debugging. 

If your budget permits, you may want to look into an in-circuit emu- 
lator. This tool replaces the processor on your target hardware and 
allows you to electronically look into your target. They typically pro- 
vide all the favorite debugging methods available on host debuggers, 
such as single-step and break point. Additionally, they usually allow 
you to stop a processor that has stopped responding — a highly desir- 
able feature for anyone who has experienced such a lock-up while run- 
ning DOS-based native debuggers. Also, many other hardware features, 
such as trace buffers, are usually available for in-circuit emulators. This 
type of buffer allows you to look into the history of your program so 
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that if you wonder how you got to a certain point in your program, the 
trace buffer will point it out. Finally, if source-level debugging is an 
option, you should buy it. It will simplify your job by orders of magni- 
tude. Just make certain that it supports your development tools. 

If you are on a limited budget or you are using DOS-C as a learning 
tool, you will want to look into remote debuggers. Many of you are 
probably already familiar with the remote debugging features of debug- 
gers such as Turbo Debugger (Borland International) or CodeView 
(Microsoft Corp.). These run on a remote DOS machine, but the type of 
remote debugger you will want to use is one that runs stand-alone. They 
sometimes require some work to adapt to the target hardware but are 
well worth the effort. Some will also work in combination with ROM 
emulators, making the combined tools almost as powerful as an in-cir- 
cuit emulator. 


Getting Started 


Now that I have briefly touched on why and how to write your own 
operating system, I will begin to build DOS-C. I will approach its 
development in a logical fashion by first developing a set of require- 
ments. I will examine MS-DOS from the outside in order to assist in the 
development of a set of general specifications. I will develop a general- 
purpose method of examining error recovery and apply it to round out 
our requirements. When completed, I will proceed to develop the oper- 
ating system itself. 


The full source code for this book is now available on the pub- 
lisher’s ftp site at ftp.mfi.com/pub/rdbooks/FreeDOS. zip; 
login as “anonymous” and download the file. Any references to 
the “companion code diskette” in the book now refer to the code 
available on the ftp site. See Appendix C for more details or 
check the FreeDOS home page at http://sunsite.unc.edu/ 
pub/micro/pc-stuff/freedos or my personal web page at 
http: /www.monmouth.com/user_pages/patv for updates. 


Chapter 2 


DOS Basics 


Where to Start 


Many programmers have come to take MS-DOS for granted. Its API 
has become a standard in the industry and is known by developers 
world-wide. Many books, published by Microsoft and others, describe 
the interface. Many other books contain information about writing 
applications for DOS, and others talk about how to extend it. Still oth- 
ers describe how to write extenders, device drivers, etc. Additionally, 
periodicals dedicate space to and books are published about the 
“undocumented” system calls and data structures. 
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Although Microsoft produced a great deal of programmer and user 
documentation for MS-DOS, there is plenty of undocumented informa- 
tion about MS-DOS internals. Proof of this is the sheer volume of 
“undocumented” books and articles published about MS-DOS. However, 
even with the undocumented sources, the target audience for this docu- 
mentation is application programmers who want to emulate some of 
Microsoft's private programming tricks, not for programmers who want 
to design their own version of DOS. Very little documentation exists 
about MS-DOS intemals. You may need to provide the same interface as 
MS-DOS and in some cases emulate MS-DOS internals in your design in 
order to achieve any level of binary compatibility. 

Whenever possible, I followed reference articles and texts during the 
design. These references are very good and cover a good portion of 
what you need to know. However, as pointed out earlier, questions will 
come up where you will need to perform little experiments to find out 
how DOS behaves. One reason is that many times, a programmer’s ref- 
erences cover a subset of error responses; however, you will need to 
cover other undocumented instances. Another reason is that you can 
gain insight into the DOS design by running a test program that exer- 
cises a system call. 

Exercising system calls through varying conditions is one of the 
best ways to study DOS. Many times, you don’t know what algorithm 
the MS-DOS designers used. Quite often, it may be possible for many 
algorithms to perform certain functions. Unlike an open system where 
all functions of the operating system adhere to a well-known standard, 
Microsoft created the system functions and modified them as they 
went along. In some cases, the changes occurred as bug fixes or as the 
result of developer learning curves. In other cases, Microsoft created 
new system calls to take advantage of new features introduced with a 
release. As a result, many applications have become dependent on 
these features and any little quirks that they may exhibit. You need to 
emulate these features closely so that your version of DOS doesn’t 
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break existing applications. Although more than one algonthm may 
achieve a given function, you must choose the algorithms for your DOS 
that closely emulate the original MS-DOS version. 

This chapter contains a sample experiment whose purpose is to 
examine the minimum set of files and other objects that MS-DOS 
requires to run. The goal is to demonstrate a methodology that you can 
use whenever a design question arises. From there, the chapter covers 
the basic DOS boot procedure, physical disk organization, API, file sys- 
tem, memory management, and device driver. 


Basic MS-DOS Architecture 


The way I like to study any new subject is by starting with the overall 
picture. I like to peel away the different layers, uncovering new infor- 
mation with each new layer. I find this approach helps me comprehend 
anew subject better because I will not get lost in the details too early in 
the game. I will take this approach and begin by looking at the four 
basic MS-DOS files: boot, i0.sys, msdos.sys, and command.com. You 
may ask, “How do I know that these are the basic MS-DOS files?” Per- 
form a simple experiment to find out. 

You will format a bootable floppy disk as the basis of this experi- 
ment. Take a blank disk, place it into drive A, and type format a: /s 
at the DOS prompt. MS-DOS chugs along and gives you a bootable 
disk. You know it’s bootable because you can now hit Ctrl-Alt-Delete 
and up comes MS-DOS. You can copy files, delete files, execute other 
programs, etc. In short, every essential MS-DOS function 1s available at 
the new prompt. 
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Now take a closer look at this bootable disk and see what MS-DOS has 
done to it, aside from laying out track and sector structure to make it boot- 
able. When you examine the directory, you find one file, command. com, on 
it. This file is the MS-DOS command line interpreter or shell. If you do a 
dir a: /a:h, you find two hidden files, io.sys and msdos.sys 
(or ibmbio.sys and ibmdos.sys in certain versions of MS-DOS and 
PC-DOS). It seems that Microsoft considers these three files to be the min- 
imum components needed for MS-DOS to operate. 

This is interesting, but you know from the references that one other 
component will not appear in any directory listing because it is not a 
true file: the boot sector. The boot sector is the first sector placed onto 
the disk by the format command after the disk physical format opera- 
tion. The PC architecture defines its function, and it is a de facto stan- 
dard. All PC-style computers load and execute the boot sector in order 
to boot any operating system; it is the final component to the minimum 
set of programs for MS-DOS. 

Each of the four components has a specific role in starting MS-DOS. 
As you can see, your little experiment proves that only four components 
are necessary for a minimal DOS operating system. If you limit your 
design to only these components (or their equivalent functions), your 
design becomes easier to handle by eliminating the development of tran- 
sient commands such as attrib, format, etc. 

Having established the minimum set of programs necessary for 
MS-DOS to operate, you can examine the dynamics of these four pro- 
grams. If your computer is a standard PC-class computer like mine, 
the BIOS performs a power-on self test (POST) the instant you tum 
your computer on. The basic tests your computer may perform are a 
memory test, followed by a keyboard test, disk test, and hard drive 
test. You may also see coprocessor tests and other varied tests inter- 
spersed among the basic tests. At the end of these tests, you will hear a 
beep. The question now is, “What happens after the beep?” 
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The answer is simple: Your computer’s BIOS now attempts to boot 
the operating system from either the floppy disk or the hard disk. If it’s 
booting from a floppy, it brings the boot sector into memory at location 
0:7cOOh. This is an important address, because all BIOS guarantee to 
load the boot at this location for compatibility reasons. If the boot sec- 
tor is not at 0: 7c00h, MS-DOS will not boot properly because a closer 
examination of the boot program reveals that its address modes require 
this start address (unlike a .com or .exe file that can load anywhere 
within the x86 real mode memory). If it’s booting from a hard drive, the 
BIOS loads a master boot sector to determine the active partition and 
then loads the boot record into the same address. However, for educa- 
tional purposes, you will continue to examine a floppy disk boot. 

After loading the boot, the BIOS performs a far jump to 0:7cO0h. 
MS-DOS places a boot sector onto a disk or hard disk regardless of 
whether it is bootable or not. This protects the user from inadvertently 
attempting to boot from a nonbootable disk (unfortunately only for x86 
processors). By design, the boot checks for a bootable hard or floppy 
disk by looking at the root directory for the presence of two required 
MS-DOS files, io.sys and msdos.sys. If boot does not find them, it 
issues the well-known “non-system diskette” message and waits for a 
new disk to retry. The hard drive boot does not wait but halts, since a 
boot failure means that something is seriously wrong with the active 
partition, whereas a floppy boot error may result from a user inserting 
the wrong floppy disk into the boot drive. 

The boot next searches the root directory for io. sys and msdos.sys 
(or ibmbio.sys and ibmdos.sys). It reads each entry in the root direc- 
tory in order to determine the filenames. If these files are present, boot 
proceeds to load io.sys. It does this by getting the starting disk sector 
from the matching j0.Sys entry, copying each sector into a fixed loca- 
tion in memory, and transfering execution control to it. One significant 
boot limitation is that it can only load sequential records. Boot’s size 
limitation is significantly less than one sector, and the developers who 
created boot were very creative when it came to trading off functional- 
ity versus size. 
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Once the transfer of control takes place, i10.Sys proceeds to initial- 
ize its device drivers. These drivers are the fundamental device drivers 
that every DOS system has: floppy disk, hard drive, console, printer, and 
auxiliary serial port. Up until this point, boot communicated strictly 
through the BIOS functions in order to save space and achieve portabil- 
ity, but i0. Sys now uses its device drivers for control. 10. Sys contin- 
ues to initialize its device drivers by loading msdos.sys and instructing 
it to initialize itself. It then uses msdos.sys to read config.sys, if 
present. 10.Sys parses each line of config.sys and sets up the envi- 
ronment space, loads and initializes the device drivers, and performs 
other functions necessary for a successful boot. When i0.sys com- 
pletes all this, it loads command.com and finally transfers control to it. 

Notice that I have not mentioned the familiar DOS startup file, 
autoexec.bat. That is because it is really not part of the actual boot 
process but a convenience placed into command.com. The designers 
of MS-DOS knew that many people modified CP/M in order to exe- 
cute a program on Startup and that other operating systems, such as 
UNIX, had special files or directories of files that executed on startup 
(i.e., /etc/rc). These files are extremely useful in turnkey applica- 
tions and tailoring environments. You can perform a short experi- 
ment to verify that autoexec.bat is a function of command. com. You 
can take the sample disk formatted earlier and add a simple 
autoexec.bat file to it. Next reboot your computer to verify that the 
new file executes. Then change command.com to a simpler or differ- 
ent command line interpreter. What you will see is the same as 
before, except without the execution of autoexec.bat. By verifying 
that autoexec.bat did not execute when the command line inter- 
preter changed, you verified that its association is with command.com. 
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You now have enough data from this little experiment to generalize 


the MS-DOS boot procedure. 


I. 


Four mandatory components of a basic MS-DOS system are: 
boot, i0.sys, msdos.sys, and command.com. Two optional files 
are: config.sys and autoexec.bat. 


. To initiate a boot sequence, load the boot sector at 0: 7c0Oh and exe- 


cute it. 


. Boot must examine the disk. If bootable, it loads the first stage, 


i0.sys (or ibmbio.sys), and executes it. 


. Upon transfer of control, io0.sys loads msdos.sys (or ibmbio.sys 


loads ibmdos.sys) and allows it to initialize. io.sys then uses 
msSdos .Sys to parse and execute config.sys if present. 


. When completed, command.com is loaded and control is transferred 


to it. 


. command.com executes autoexec.bat if present and displays the 


familiar DOS prompt, awaiting user input for commands. 


You will use this information later when you design your architec- 


ture for DOS-C. 


What Do Physical Disks Look Like? 


One area of mystery to many developers is the physical location of sec- 
tors stored on a disk. Unless you’ve developed disk drivers before, 
you've probably never seen this. Even if you did write a disk driver, 
odds are that you’ve never looked at the physical file system organiza- 
tion. I’1] show you the physical and logical organization of the MS-DOS 
file system so that the mystery will finally be solved. 
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Generic Disk Organization 


Disk drives are ubiquitous. Nearly every desktop computer, worksta- 
tion, server, and mainframe has at least one. They are the preferred sec- 
ondary storage for computers. Typically, their size is many times the 
primary RAM and ROM storage contained within the computer itself. 
They’re also taken for granted. Why? Because storing data onto the 
disk is done without regard to either physical or logical organization. 
Just pause for one moment and ask yourself when was the last time you 
thought about which sectors the file you just created occupies? 

Before I get into such details, you need to understand the terms used 
to describe how the disk stores data and how the disk actually works. 
Figure 2.1 is a view of a generic disk. This disk has one or more physi- 
cal platters that hold the data. Each platter allows data to be read and 
written from either one or both surfaces due to a coating capable of 
being magnetized. This coating is similar to that used on mylar tapes 


Figure 2.1 Diagram of a generic disk architecture. 


Physical 
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that store data or audio. The data is stored by magnetizing a portion of 
this surface as the surface moves beneath a read/write head. Unlike a 
tape, where the tape is spooled from one reel to another, a disk rotates 
so that the same physical location is available once per revolution and 
data can be written and repeatedly accessed by waiting for the next rev- 
olution. This stripe of recorded data is called a track. 

This track contains all the data. You need to access this data much 
the same way you access internal memory — you need to be able to 
modify it at random intervals. In order to facilitate this, the track is bro- 
ken down into sectors that have prerecorded markers (laid down by a 
formatting program) that identifies where a sector is physically located 
on the track. By prior agreement, this sector is defined as a fixed num- 
ber of bytes designed to fit within the physical space between the prere- 
corded markers. With a fixed number of bytes contained within each 
sector, you can store the number of bytes equal to the size of the sector 
times the number of sectors recorded on each track. 

There is more than one track per disk. In fact there are two direc- 
tions in which you can distribute the tracks: on different surfaces of the 
platter and across the surface. Typically, a disk has more than one sur- 
face. It usually has two surfaces and multiple platters. By placing a 
read/write head in close proximity of each surface, you can access this 
data by electrically switching heads and reading or writing data on this 
surface. This organization is known as a cylinder because you can visu- 
alize the track as physically extending through each platter forming a 
three dimensional cylinder of data. You can now store the number of 
bytes in each track times the number of tracks in each cylinder. By 
moving the heads, you can take advantage of the surface area by form- 
ing concentric rings or tracks on each surface and concentric cylinders 
across the platters. With this additional dimension of storage, you can 
store the number of bytes in each cylinder times the number of cylin- 
ders on each disk drive. 
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Floppy Disk Drives 


To better understand how the logical MS-DOS file system organization 
relates to the physical disk, we need to select a specific disk type to 
study. Hard drives vary in architecture from vendor to vendor. Just look 
at the number of drive types in your computer’s BIOS. Floppy disks are 
much more standard and easier to study. I’ll discuss a 360Kb floppy disk. 

Like the generic disk, the data organization on a floppy disk is bro- 
ken into sectors, tracks, and cylinders. The platter on a floppy disk is 
made from mylar as opposed to aluminum or some other alloy. Because 
the mylar is very flexible when compared to the metal platter, the term 
“floppy” was applied to it. There is only a single platter with one or two 
surfaces, so a maximum of two tracks compose a cylinder. The number 
of cylinders vary, but most common is 40 or 80. 

For a 360Kb floppy, the organization is nine sectors per track, two 
tracks per cylinder and forty cylinders per disk. Other architectures are 
possible, such as 720Kb and 1|.44Mb floppy disks. These disks are also 
different physical sizes, typically 5 1/4-inch and 3 1/2-inch, but other 
sizes such as 8-inch and 3-inch existed in the past. With MS-DOS, the 
sector size is typically 512 bytes, but other sector sizes were also used 
in the past. 


Logical Versus Physical 


Both MS-DOS and DOS-C internally reference the disk drive as an 
array of fixed-size blocks of data. Each block of data in this array is 
sequentially numbered from zero to one less than the maximum size of 
the disk (in sectors). DOS-C and MS-DOS impose a structure upon this 
array of sectors in order to easily access data and organize files. This 
logical file system organization is shown in Figure 2.2. DOS-C and 
MS-DOS place a boot sector at the first sector on the disk because this 
is the best defined sector on the disk. This sector contains the code to 
load the operating system and a data structure that contains all the perti- 
nent variables that define the disk geometry. For a 360Kb floppy disk, 
the next sector is a reserved sector and it’s followed by two FAT areas, 
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each two sectors long. The last area reserved by the file system is the 
root directory and it’s 7 sectors long. The remainder of the disk is dedi- 
cated to data and subdirectories. 

Mapping this logical structure into the physical disk requires some 
thought. In order to map the disk, you need to standardize how the log- 
ical sector numbers map into physical sector, cylinder, and track num- 
bering. For MS-DOS and compatible operating systems, the first block 
is mapped into sector | of cylinder O, track |. The track number 
increases for every nine sectors on a 360Kb floppy, followed by 


Figure 2.2 Diagram of logical file system organization for 
MS-DOS and DOS-C. 
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increasing cylinder number. This mapping is illustrated in Figure 2.3. 
The file system structure does not take up a full cylinder and is spread 
out over two tracks. 

Although not immediately apparent, this organization allows for the 
most rapid access when reading files sequentially. Each sequential sec- 
tor can be accessed immediately upon reading the previous sector. 
When the last sector on the track (sector 9 for a 360Kb disk) is read, the 
first sector on the next track can be immediately accessed on the start of 
the next rotation. Once all sectors for every track in the cylinder are 
read (two tracks for a 360Kb disk), the next cylinder is read by physi- 
cally moving the heads. This is important because there is a time pen- 
alty for stepping the disk heads on the order of several milliseconds. It’s 
also important that the sectors follow each other so that you don’t incur 
a penalty of a rotation of the disk before reading the next sector. This 
delay is also on the order of several milliseconds. Although the sectors 
are sequential on a floppy disk, some hardware is not fast enough, so 
the sectors are interlaced so that the sequential disk reads are completed 
in a Minimum number of rotations. This is known as sector interleave 
and is necessary on some smaller XT-type systems. 


MS-DOS API 


As I discussed earlier, Microsoft based the early MS-DOS API on the CP/M 
model. This model enters all call parameters into one or more registers 
and makes a system call, possibly to a fixed address. MS-DOS offers two 
such system calls. One address is cs: 05h. This system call is very nearly 
a duplicate of the CP/M system call. CP/M used the same address, and all 
of the initial MS-DOS v1.0 system calls (sometimes called the FCB sys- 
tem call, because of the reliance on File Control Blocks for disk I/O) are 
nearly identical to their CP/M counterparts. This model is pnmarily for 
the .comtype programs coded as small or tiny model programs. These 
program images also are similar to the CP/M-style executable image. 
Although the pnmary reason for this entry point is to accomodate .com 
programs, it is available to tiny- and small-model . exe programs as well. 
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Figure 2.3 Diagram of logical file system organizaiton 
mapped into a physical disk. 
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Another MS-DOS API entry similar to the CP/M style cs:05h is a 
call to 0:90h. As with the cs:05h call, this system call entry is for use 
by .exe-style programs that use medium and large models in their 
code. This entry point received little use, if any, and was actually incor- 
rect in some earlier versions of MS-DOS. 

By far the most used and widely accepted MS-DOS API is the int 2xh 
system calls. Unlike CP/M, the MS-DOS designers had software inter- 
rupts, such as int 21h, to work with. Using software interrupts causes the 
processor to push ip and Cs register flags on its stack. It then looks up a 
new vector by translating xxh (by multiplying it by four) and placing the 
long-word vector found at the previously computed memory location into 
cs:ip. This is a very fast call and 1s highly flexible. Microsoft reserved the 
range of 20h through 3fh for use by MS-DOS and derivative operating 
systems. Other interrupts are similarly used by services outside MS-DOS, 
such as the system BIOS and application programs. 

At the highest level, the Microsoft-reserved vectors of 20h through 
31h allow a maximum of 18 software interrupts. However, by using 
values in registers to further subdivide each call, the technique allows 
for 13 API entry points with almost unlimited function calls for each. 
Table 2.1 presents a summary of these calls. 

Through its development, MS-DOS was the recipient of three variet- 
ies of API call methods that use the software interrupt call technique. 
Each technique is simple to use and well documented. They do require, 
however, that the programmer become familiar with each and follow 
conventions. Otherwise, MS-DOS can break very easily. 

The first type of API call is the simple software interrupt. System 
calls int 20hand int 24h—29h are examples of this type of call, which 
has only one defined function for each system call. Examples of the sim- 
plest calls are the terminate calls (Terminate Program and Terminate and 
Stay Resident). Through these system calls, programs retum only simple 
status information to MS-DOS. Examples of slightly more complex 
calls are the Absolute Disk Read and Absolute Disk Write calls. As with 
the simpler calls, there is no modification to function behavior by the 
call, and the call performs only one function. These calls are only 
slightly more complex in that the system call returns to the caller after 
performing its function, allowing the application program to continue. 
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The second type of API call is characterized by multiple functions avail- 
able through a single int xxhentry. The primary DOS API int 21h is 
an excellent example. To select a function, load register ax with a func- 
tion number that corresponds to a unique system call. The application 
program loads other registers with variables applicable to the particular 
system call or pointers to an area of memory for system calls. MS-DOS 
then performs the operation and retums its results in registers or mem- 
ory pointed to by registers. 


| Table 2.1 DOS int System Calls. | 


| int Number Function | 
| 20h Terminate program | 
| 21h System calls | 
| 22h Termination address | 
| 23h Break address | 
| 24h Critical error | 
| 25h Absolute disk read | 
| 26h Absolute disk write | 
| 27h Terminate and stay resident | 
| 2Bh Idle handler | 
| 29h Fast console /O | 
| 2ah Network handler | 
| 2bh Not defined for DOS kernel | 
| 2ch Not defined for DOS kernel | 
| 2dh Not defined for DOS kernel | 
| 2eh command .com reload | 
| 2fh Multiplex interrupt | 
| 30h far jump DOS entry | 


31h Part of 31h | 
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The MS-DOS int 21h system call is also the best documented 
example of this type of system call. Close examination of this API 
reveals that many functions contain not only a major function number 
but a minor function number as well. This allows greater expansion of 
system calls and also allows for logical grouping of related system 
calls. The IOCTL system call is the best example of this. 

The third type of API call is the chained system call. Chained sys- 
tem calls used by MS-DOS extensions, such as installable file systems 
and network system calls, are similar to the techniques used to extend 
MS-DOS by third-party vendors. The multiplex interrupt, int 2fh, is 
an example of this call. When an application or MS-DOS uses a func- 
tion installed into the multiplex chain, it loads a major and minor func- 
tion number into the ax register. Each extension saves the call context 
and examines the ax register. If it matches, it performs the service and 
retums to the application or MS-DOS through an iret instruction. If it 
is not a function defined within the extension, MS-DOS restores the call 
context and passes control to the next extension’s entry point stored dur- 
ing installation of the extension. Should no extension handle the call, 
MS-DOS performs a default action and returms to the application. 

You need to study these API calls because a clear understanding of 
system call techniques will be very important later when you design 
your DOS-C API. Because the goal is to create a DOS-compatible 
operating system, you will need to respond to the same type of system 
calls in order to guarantee a level of compatibility. The exact mecha- 
nisms will be discussed in later chapters. 


MS-DOS File System 


The MS-DOS kemel supplies applications with a number of services. 
These services provide process management, memory management, file 
management, and miscellaneous services. Probably the greatest number 
of MS-DOS system calls relate or pertain to file management. This sec- 
tion explores the MS-DOS primary file system, the File Allocation Table 
(FAT) file system, and examines key details and data structures. 
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As I discussed earlier, the original MS-DOS purpose was to manage 
small, 160Kb floppy disks. Fixed disks with gigabyte storage capacities 
on a desktop computer and multimegabyte memory were unheard of. 
Instead, MS-DOS designers faced space restrictions both on disk and in 
memory. They also faced high rotational and head positioning latency 
times inherent to the 51/4-inch floppy disk drives. This forced MS-DOS 
designers to be creative in both disk layout and system call functional- 
ity. This creativity will be apparent as you closely examine the FAT file 
system. 

Begin the examination of the FAT file system by examining a simple 
DOS floppy disk that contains only a root file system and is representa- 
tive of a freshly formatted floppy disk. Its structure is flexible enough to 
be compatible with almost any removable media, regardless of size. 
Figure 2.4 illustrates a conceptual model of disk storage in an operating 
system compatible with MS-DOS. In MS-DOS, a logical disk is a 
sequential set of 512-byte sectors. The disk consists of a boot block, 


Figure 2.4 Floppy disk storage model. 
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followed by zero or more reserved sectors, and the file system. The 
boot and reserved sectors are for disk booting and disk maintenance, 
although the boot sector also carries file system information. 

Following the boot-related area is one or more FAT areas. Each FAT 
consists of one or more sequential sectors in which each entry consists 
of a forward pointer in a linked list of pointers. Values of zero indicate 
free space, and other values indicate bad sectors, disk size, and disk 
type. The need for more than one FAT stems from the greatest weak- 
ness of the MS-DOS file system, the FAT itself. If a program crash 
destroys a FAT, no other information anywhere within the file system 
allows you to recover your files. Because disks are probably the compo- 
nent in a computer with the lowest reliability, redundant FATs improve 
the file system reliability compared to a single FAT. 

Immediately following the FAT(s) is the root directory. In early 
MS-DOS days, the root directory was the only disk directory, similar to 
its CP/M cousin. All data relating to filename, location, creation date, 
file size, and file protection are contained within an entry in the direc- 
tory. Starting with MS-DOS v2.0, the file system became hierarchical 
with other directories that extend from the root directory. However, 
there are differences between a root directory and other disk directo- 
ries. The first significant difference is that the root directory does not 
contain an entry for “.” (itself) and “..” (its parent) directories. This 
choice allowed for backward compatibility with systems running pre- 
2.0 versions. Another significant difference is that the root directory is 
fixed in length, although a subdirectory can be expanded. Again, com- 
patibility with pre-2.0 versions forced this limitation. 

Hard drives are very similar to the diskette in design. The biggest 
difference is that hard drives can support multiple disk images, which 
in turn can support more than one operating system. This concept bor- 
rows the idea of multiple volumes from other operating system designs. 
It improves on the concept by placing the multiple-volume concept out- 
side the operating system and in control of the boot code. Each volume 
is called a partition and shares a single master boot record. 
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As with floppy disks, when an x86 system boots, it loads the first 
sector into memory. The difference between the floppy and hard disk is 
in the functionality of the boot code contained in this sector. The master 
boot record contains code that examines a partition table also contained 
in this sector. It determines the active partition and proceeds to load the 
first sector of this partition into memory. The remainder of the boot pro- 
cess is then exactly the same as for a floppy disk. By using this design, 
the file system remains identical for both floppy disks and hard disks. 

However, certain on-disk changes to the data structure are necessary 
when certain physical sizes exceed certain structure member sizes. This 
occurs on drives when the physical size begins to exceed a 12-bit 
encoding limit in the FAT and a 16-bit limit on the device drivers. 
These limits challenged MS-DOS but its designers were smart enough 
to extend to 16-bit FAT entries and 32-bit block addressing to overcome 
these hurdles. 

What you have seen so far is the principal MS-DOS file system 
architecture. Other file systems are supported by MS-DOS, such as the 
CD-ROM file system, but I will not cover them due to space limita- 
tions. However, I will further examine each of the file system compo- 
nents in detail so that you can design your DOS clone. 


Boot Area 


The first area of the disk, as seen earlier, is the boot area. This area con- 
sists of the boot sector plus zero or more reserved sectors. You already 
know that this sector executes first in order to start MS-DOS, and you 
know that every disk has one whether it is bootable or not. What you 
don’t yet know is what the structure of this sector is and how MS-DOS 
knows how to access a given sector on different disk geometries. 
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Figure 2.5 shows a diagram of the boot sector.' Three areas are 
shown (although this varies for Master Boot sectors). Every disk boot 
must contain these three areas. The first area, called boot sector, con- 
tains an important disk structure with all information about the archi- 
tecture of the disk. The second area is the body of the executable boot 


t File system structures are presented as diagrams in this book to avoid confusion. The 
MS-DOS design is for 80x86 processors and ignores portability issues. As a result, all 
data structures contained on disk reflect 80x86 byte and word order. Additionally, all infor- 
mation published by Microsoft reflects the assembly language roots of MS-DOS through its 
presentation as assembly language data structures. Later chapters address portability issues 
and techniques. All data read from and written to a storage medium respects MS-DOS con- 
ventions, but all in-memory data structures are native C structures for efficiency. 


Figure 2.5 Floppy disk boot sector diagram. 
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code. The final area is a signature word, 0x55aa, used to identify this 
sector as a valid boot sector. 

The earlier discussion centered on how MS-DOS boots. In it, I dis- 
cussed loading the boot sector into location 0:7c00h and transferring 
control to this address. I also spoke about various disk geometries. 


Figure 2.6 The boot sector data structure. 
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I will use the boot sector data structure, shown in Figure 2.6, to explain 
how this is possible. 

The first entry in this table is an area called bsJump, which is a 
reserved entry that contains an Intel 3-byte jump: either e9 xx xx abso- 
lute jump or eb xx 90 relative jump. Fortunately, the ugliness of the 
Microsoft design for Intel processors rears its head only with this entry. 
The sequences are actually Intel opcode for an absolute jump or a rela- 
tive jump followed by a no-op instruction. It does, however, explain 
how the disk information (in the boot sector area) placed ahead of the 
boot code does not interfere with the booting process since this entry 
points to the actual boot code entry address. 

The remainder of the data structure is much of what you expect from 
the boot area. The next entry, bsOemName, is 8 bytes allocated to an 
OEM name and version. This entry identifies who formatted the disk, 
not which operating system is currently running. It is on the disk for 
information only and serves no other purpose. 

Starting with the next entry, the boot sector duplicates the MS-DOS 
BIOS Parameter Block (BPB). The first BPB entry, bsBytesPerSec, is 
a 16-bit entry that specifies how many bytes are in a sector. This is typ- 
ically 512, but can be different. 

The next area, bsSecPerClust, indicates how many sectors are in a 
cluster. Although I have not discussed the cluster yet, this is where you 
first encounter it. A cluster is a group of sectors used as a fundamental 
unit of storage allocation and serves two purposes. First, it places a 
level of indirection between the operating system and the physical disk 
media format. This allows changes in disk and operating system tech- 
nologies to be mutually exclusive. Second, it allows disk sizes to grow 
much larger than would be possible with just a sector allocation 
scheme. An example is a typical hard disk allocation scheme. These 
disks typically use 4096-byte clusters, allocating 8 sectors per cluster. 
If the addressing limit is 16 bits, the limit is 33Mb of storage using sec- 
tor allocation but increases to 268Mb using an 8Kb cluster allocation 
scheme. With 32 bits, sector allocation yields approximately 2200Gb 
and |17,600Gb respectively. 
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The entry bsResSector specifies how many reserved sectors are on 
the drive or disk, including the boot sector. This entry is useful for both 
the operating system and the boot program. As you will see later, the 
operating system uses this entry in computing the location of key disk 
data structures. Boot can use this entry to compute how many addi- 
tional sectors to load into memory for more complex booting schemes. 
An example of such a scheme may be booting multiple operating sys- 
tems. It is possible for boot to load a secondary, more complex boot 
program that would then proceed to present a menu and load the 
selected operating system. Other schemes are also possible, limited 
only by the developer’s imagination. 

Jumping ahead, bsMedia is the media descriptor. This scheme was 
an early attempt at automated identification of the media type that is 
present in a drive. MS-DOS passes this byte, by specification, to the 
device driver. Originally, this entry switched a drive from low to high 
density. It is, however, an incomplete scheme where the same byte 
appears to specify more than one type of media to the same device 
driver. Although Microsoft may use it, I will only pass it to the device 
driver for compatibility purposes. 

The entries bsFATs, bsRootDirEnts, bsFATsecs, bsSecPerTrack, 
and bsHeads, when combined with bsResSector, are used to translate 
a logical sector number to a physical sector location. For example, 
to get to the primary FAT disk logical sector, start at logical sector 
bsResSector (logical sectors start at 0). To get to the secondary FAT 
logical sector, start at bsResSector + bsFATSecs. An interesting note 
is that more than two FATs are permitted on any disk, implied by 
bs FATS. So, if you want to get to the nth FAT, start at 


bsResSector + (n#bDsFATSecs) 


resulting in the generalized FAT algorithm. 
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From Figure 2.4, you saw that the root directory followed the FAT 
areas. You can compute the start of the root directory with the algorithm 


bsResSector + (DSFATS*bDSFATSecs). 


Finally, any data sector you need to find starts at the logical sector 


log sector = bsResSector + (DSFATS*bSFATSecs) 
+ (32*bsRootDirEnts/bsBytesPerSec). 


Because I have been talking about logical sectors, you will later see 
that this is the method MS-DOS uses to convey a desired sector to the 
device driver. The de facto DOS convention for physical translation is 
the sequential cylinder model, where each cylinder is composed of a 
number of tracks, and cylinders increase with logical sector numbering. 
To translate a logical sector to a physical cylinder, track, and sector 
model, the cylinder number is computed as the integral computation 


log sector/(bsSecPerTrack*bsHeads). 


The track number is computed by 


log sector mod (bsSecPerTrack*bsHeads )/bsSecPer Track. 


Finally, the physical sector number is 


((log sector mod (bsSecPerTrack*bsHeads)) 
*mod bsSecPerTrack ) +1. 


(Physical sector numbering starts at 1.) 

There are only two other key entries, bsSectors and bsHugeSectors. 
These entries are used for error checking by the device driver. If a logical 
sector number is greater than or equal to bsSectors or bsHugeSectors, 
then an error condition exists. The reason for two separate entries is 
historical. Originally, bsSectors was the only entry governing disk size. 
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It is limited to 16 bits, which is insufficient for larger hard disk drives. Dur- 
ing its evolution, MS-DOS gained a second entry to handle these larger 
drives, bsHugeSectors. The method used to decide which entry to use is a 
simple algorithm: look at bsSectors, if it is zero, use bsHugeSectors, 
else use bsSectors. 

The remaining entries, bsBootSignature, bsVolumeLabel, 
bsReservedl, bsVolumeID, bsDriveNumber, and bsFileSysType, 
deal with disk volume identification or are internal data structure mem- 
bers. The remaining sector, bsBootSignature, is used in other calls, 
and all of the remaining data structure members are informational only. 

bsHiddenSecs is an additional entry that allows MS-DOS to hide, 
or reserve, additional sections that are hidden from the file system. 
However, bsHiddenSecs is rarely used. 


FAT Area 


Interestingly enough, the FAT is probably the best-documented area in 
all of MS-DOS. Frankly, I don’t know why, but I will review it here in 
order to present a complete MS-DOS file system discussion. For those 
of you who find this boring, skip ahead. 

All files contained on the basic DOS file system are composed of 
three distinct parts: a directory entry, a linked list in a FAT, and a collec- 
tion of disk sectors that contain the information in the file. I will discuss 
the directory entry later, but for the purposes in this section, the directory 
entry contains the means by which MS-DOS identifies the file by name 
and by a number used as an index into the FAT array. For example, when 
MS-DOS opens a file for read access, it finds the file by matching the 
filename and retrieves the starting FAT index. In essence, this is the only 
action necessary to open a file, although MS-DOS maintains other inter- 
nal data structures that are also updated. 

The FAT and the data area have a one-to-one relationship. This rela- 
tionship allows MS-DOS to keep track of the data clusters in an efficient 
manner. By definition, each FAT entry corresponds to its equivalent 


40 — The FreeDOS Kernel 


sequential cluster. For example, any reference to cluster number 3 also 
refers to FAT entry number 3. The FAT functions as a linked list. If you 
want to find the cluster that follows cluster number 3, you look at the 
FAT entry number 3 and get its content. This number now gives the next 
data cluster number. Figure 2.7 illustrates how this works. 

In the example, you see two files. The first file, autoexec.bat, is 
probably very close to one you may have on your boot disk. When 
MS-DOS opens the file, it first goes to the directory and begins a lin- 
ear search for the filename autoexec.bat by looking for a string 
match. When MS-DOS finds a string match, it performs some internal 
updates to the file table and picks up the entry for the starting disk 
cluster, in this case 2. An interesting note about this entry, all clusters 
on a disk begin with cluster number 2, with O reserved as a free clus- 
ter indicator and | reserved for MS-DOS. This is important when you 
begin your design. 

If you now instruct MS-DOS to read the file sequentially, you will 
read all sectors in cluster number 2 first. MS-DOS does this in one of 
three ways, depending on the number of bytes requested and the current 


Figure 2.7 MS-DOS FAT read example. 
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position of the MS-DOS intemal file pointer. In the first case, when you 
read less than a whole sector and the file pointer is at an arbitrary posi- 
tion within the file, MS-DOS reads a sector from the cluster into a buffer 
and transfers the requested number of bytes to memory. In the second 
case, where the file pointer is at the start of a sector boundary and the 
requested number of bytes is a multiple of a sector, MS-DOS transfers 
the data directly into memory. In the third case, the internal pointer is at 
an arbitrary position and the requested data transfer is greater than a sec- 
tor. In this case, MS-DOS performs a combination of actions of the first 
two cases and continues until a new cluster is needed. 

When MS-DOS begins to read the next cluster, it needs to find its 
physical location on disk. It does this by looking at the FAT in the entry 
corresponding to the current cluster number. The number it reads corre- 
sponds to the next entry to be read. In this fashion, each FAT entry cor- 
responding to a physical cluster forms a forward link for the 
corresponding file data area. This process continues until MS-DOS 
encounters an end-of-file marker. The example in Figure 2.7 shows this 
for the third cluster of autoexec.bat. The sequence of clusters is 2, 3, 
and 4. FAT entry 4 contains an ff fh, which signals MS-DOS that this 
is the last cluster of the file (by definition, any entry in the range of 
ff8h through ff fh indicates the end-of-file). Table 2.2 lists all possible 
values contained in a FAT entry and their meanings. One note — this 


| Table 2.2 FAT entry values and meanings. | 





Value Meaning | 
| «0 )000h Free cluster | 
| 0)001h Not used | 
| (0)002h-(f)fefh Data cluster indices | 
| (f)ffOh-(F) FF6h Reserved | 
| (f)ff7h Bad Cluster, used for bad sector mapping | 


(f) ffBh-(f) ff fh Last cluster | 
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Figure 2.8 MS-DOS FAT write example. 
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discussion uses 12-bit values, which correspond to the type of FAT typ- 
ically found on floppy disks and small hard disks. You quickly run out 
of range on larger hard disks, so a 16-bit FAT is used for these. This dis- 
cussion holds true for 16-bit FATs, just sign extend the numbers. 

For a write, the operation is somewhat more complex because file posi- 
tion plays a big part in how write works. However, you can examine some 
fundamental MS-DOS FAT principals for write with a simple sequential 
file create and write example. Figure 2.8 illustrates this example. 

In the example, you start by creating example.txt, the example file. 
First, MS-DOS searches for a free directory entry to begin its alloca- 
tion. It does this by searching for either a 00h or Oe5h for the first char- 
acter of the entry (I will discuss their meaning in the next section). 
When it’s found, internal file data structures are updated and the opera- 
tion is complete. You can now begin the write operation, and this is 
when the FAT fun starts. As soon as you do a write, the very first cluster 
allocation occurs. In the example, cluster 5 is the first available cluster 
and is allocated to our example file. When you have written exactly one 
cluster size plus one byte, the next cluster is allocated. In the example, 
cluster 6 is occupied by a cluster belonging to sma]1.txt, so the search 
retums cluster 7. The FAT forward link entry for cluster 5 now contains 7 
and the writing continues until you finally close the file. In the example, 
this occurs somewhere in the third cluster. MS-DOS updates its data 
structures and places an EOF indicator in the FAT forward link entry, 
relative cluster 3 or physical cluster 8. 

I have been very careful in discussing the write operation by avoid- 
ing all references to time in my discussion. As MS-DOS matured, it 
seems that the timing of the actual physical write to disk changed. Also, 
I wanted to keep the discussion very generic because in the later design 
you will want to add some robustness to DOS-C, and you will see that 
the method used to update the physical disk dictates exactly how much 
of a file is recoverable in the event of a hardware or power failure dur- 
ing the write operation. 
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Directory Area and Directories 


The last part of the FAT file system discussion is the directory area. 
I have already covered both the data area cluster mapping and the FAT 
forward link and shown how an MS-DOS file is composed of three 
components. The directory is the glue that holds the file together. In 
MS-DOS, there are two types of directories, the root directory and sub- 
directories. I draw this distinction because there are some subtle differ- 
ences between them. 

The first distinction is that every disk must have a root directory. The 
root directory is the starting point for all directories and files. Whenever 
a search down any path is started, starting from the root directory is 
guaranteed to get you to the exact file or directory. 

Another difference is in the contents of the root directory. A root 
directory contains entries not only for files and subdirectories but also for 
the volume label that appears at the top of a dir command output. It 
is also why the volume label has a limit of 11 characters: It must fit into 
the space normally reserved for filename (8 bytes) and extension (3 
bytes). Additionally, it also does not contain an entry for “.” (self) and “..” 
(parent), whereas all subdirectories must always contain these entries. 

The final distinctions between the root directory and subdirectories 
are location and size. The root directory consists of a sequential set of 
sectors Starting at a fixed location on the disk, whereas a subdirectory is 
actually a special MS-DOS file that can only be accessed by MS-DOS 
itself for both read and write. The size of the root directory is fixed by 
an entry in the boot area, and subdirectories can grow in much the same 
way a file can. As you will see later, these differences are critical in the 
design of directory handlers within DOS-C. 

Having noted these critical differences, I can now state what is com- 
mon in all types of directories. Any MS-DOS FAT file system directory 
is stored as a linear array of directory entries. The rules for use of a 
directory entry are common for all file types. Figure 2.9 illustrates a 
directory entry structure. 
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There are not very many directory entry fields, reflecting the simple 
function that a directory entry performs within MS-DOS. The directory 
entry performs four functions: name the file, specify access rights, hold 
time stamps, and specify the file location and size. 

When it comes to filenames, MS-DOS borrows from its CP/M heri- 
tage by borrowing the filename and file extension method. MS-DOS 
limits the filename to an 8-byte entry and the extension to a 3-byte 
entry. Each entry is left justified and space filled. Hence the file ‘“‘abc”’ is 
stored as “ABC UU #tnedle “B.C? is Bice au, CW”, and the 
file “FileName.Txt” is “FILENAMETXT”. There are no typographical 
errors in the previous example; all filenames and extensions are con- 
verted to upper case by MS-DOS, further simplifying filename match 
algorithms. One final note, an unused field is denoted by a 00h and a 
deleted file by a Oe5h in the first location of deName. This distinction is 
important, because MS-DOS stops its linear search in a directory when 
it encounters OOh. It is an interesting way of making files disappear on 
an MS-DOS disk. 


Figure 2.9 Directory entry structure. 
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The deAttribute entry restricts file access. In this field, a set bit 
indicates that an attribute is active. Table 2.3 lists these attributes. One 
interesting note — the ATTR_ARCHIVE bit is actually a “modified” bit 
set, whenever MS-DOS does a write operation to the file. In this way, 
MS-DOS knows that a file with this bit set needs to be archived. 

For people who are familiar with UNIX and other operating sys- 
tems, these attribute bits and file restrictions seem primitive. In essence, 
they are sufficient for a small, personal system. I will revisit these bits 
and their meanings later in the design of DOS-C. 

Time stamping is another function of the directory entry for which 
MS-DOS uses the last file access that modified the file to generate the 
date and time fields. The choice of encoding method was the preference 
of the operating system architect. Unlike UNIX where the time stamp 1s 
seconds from an epoch, the MS-DOS time stamp consists of both date 
and time. Like UNIX, the MS-DOS time stamp is 32 bits wide, broken 
into a 16-bit date field and a 16-bit time field. These two fields are fur- 
ther broken down into bit fields composed of the logical parts of the ele- 
ment, such as hours: minutes: seconds and year: month: day, as shown in 
Figure 2.10. Time stamps will become a critical data item to update 
when I visit functions such as create, write, and close later in our design. 


| Table 2.3 Attribute bits of the deAttribute entry. | 


| Mnemonic Value Meaning 
| ATTR_READONLY Olh A read-only file 


| ATTR_HIDDEN 02h Hidden or directory (can’t delete) 


| ATTR_SYSTEM 04h System or directory (can’t delete) | 
| ATTR_VOLUME OBh Volume label, not a file | 


| ATTR_DIRECTORY 10h A directory 


| ATTR_ARCHIVE 20h Modified file | 
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The final ties, and probably the most important, are the file location 
links. These fields link the FAT and data clusters to the directory entry. 
The field deStartCluster is the FAT link to the first FAT entry. This is 
in essence the anchor for the file. The entry deFileSize completes the 
file description by specifying the size of the file in bytes. With this 
entry, the file becomes independent of file system architecture or cluster 
size. It does, however, limit physical file sizes to 4.3Gb. 

The design of the directory is pivotal in any program that manipu- 
lates a FAT disk, and it is no exception in DOS-C. Design tradeoffs 
concerning the timing of directory updates will affect file system 
robustness. There are also design issues regarding when to switch 
between 16-bit and 32-bit FAT tables, etc. I will examine these in later 
chapters and use the information presented here during the design. 


Figure 2.10 Date and time format used in the directory entry. 
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MS-DOS Memory Management 


In all except the simplest operating systems, there exists one form or 
another of memory management. The reason is simple: memory is yet 
another computer resource given to the operating system to maintain. 
How an operating system performs this management varies drastically 
among systems. For example, large operating systems, such as UNIX, 
provide memory management in multiple forms. One is for program use 
as part of a heap management scheme. Another is for the program itself, 
either maintaining it for program loading or dispensing it in a fashion that 
gives the program the illusion of much more memory than is physically 
available. MS-DOS manages memory using a much simpler scheme but 
still manages program heap requirements and program loading space. 

I must admit that the developers of MS-DOS did the best they could 
with a poor architecture. In many computers other than those based on 
the original PC and XT architecture, there is some form of a hardware 
memory management unit, which provides two functions: protection 
and relocation. On these machines, protection is achieved by disallow- 
ing writes to occur based on some table, either in memory or within the 
memory management unit itself. Typically, the write protection can be 
specified over portions of memory. This capability helps make these 
other machines more resilient to occurrences of wild pointers, patho- 
logical programs, etc., protecting other programs and the operating 
system in the process. It is much harder for application programs to 
crash these machines. Relocation allows the operating system to move 
physical memory around and make it look as though it is contiguous 
logical memory. Memory becomes available to a program in either a 
straightforward manner for allocation or on demand for virtual memory. 

At the time of the development of the PC, the desktop computer 
market was based on simple systems running with simple microproces- 
sors. When IBM decided to enter the market, it decided to emulate the 
simpler computers rather than introduce proven technology that it had 
used on its mainframes and minis. This unfortunate choice has always 
been a weak spot for MS-DOS, because it suffered attacks from critics 
for being vulnerable to crashes. 
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The developers of MS-DOS did, however, introduce some forms of 
memory management through the use of a memory block linked list 
that consists of a control header, known as a memory control block or 
arenaheader, and the allocated (or free) memory. (By now, it should be 
obvious that the linked list is the data structure of choice for MS-DOS.) 
To their credit, they did the best with what they had, and the MS-DOS 
memory management scheme does provide a fair degree of control. 

Figure 2.11 illustrates the MS-DOS arenaheader. Throughout the 
history of MS-DOS, the arenaheader has been referred to as the MCB 
or Memory Control Block (they are the same data structure). This data 
structure is in front of each allocated and free memory block. In the first 
640Kb, these control blocks contain an “M” in arenaSignature for all 
headers except the last one, which contains a “Z”. The next field, 
arenaQwner, contains the segment value of the PSP (Program Segment 
Prefix) when it belongs to a user or contains a 0008h if it belongs to 
MS-DOS. This value identifies the block’s owner and is used to dis- 
pose of memory when a program terminates. The next entry, arenaSize, 
is used to indicate the size in paragraphs of the memory block. 
arenaReserved is currently unused within MS-DOS and is reserved 
for potential future use. The final area, arenaName, contains either the 
name of the program that it belongs to, or a special name such as 
‘‘UMB” to indicate a special arena. When MS-DOS divides a special 
arena into subarenas, it applies the same arenaheader and manage- 
ment techniques to these areas. 


Figure 2.11 The arenaheader data structure. 
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MS-DOS also provides a number of API calls to manage the arenas. 
The basic management provided is simple allocate and dispose. Calls 
are also used to set allocation strategy, such as first fit, last fit, and best 
fit. Additionally, there are calls that detect and manipulate the upper 
memory blocks. 

MS-DOS uses special arena calls to further subdivide an arena when 
used for special MS-DOS applications. One special arena usage is for 
special storage allocations specified in config.sys, such as new 
stacks, File Control Blocks (FCBs), and device drivers. Another special 
arena usage is to identify and subdivide the upper memory blocks, 
which MS-DOS uses to load device drivers and the kemel itself into 
memory above the conventional 640Kb limit. 


MS-DOS Task Management 


Task management in MS-DOS is very simple. There are only three 
methods of loading a program: load and execute, load, and load over- 
lay. Readers who are familiar with other operating systems may find the 
limited selection a surprise, but they represent the minimum set neces- 
sary for a small operating system. When combined with the Terminate 
Program and Keep Program system calls, the load methods offer a pow- 
erful set of system calls. 

I want to point out to the reader that although MS-DOS offers a 
small selection of system calls that handle program execution, there is a 
great deal of work that MS-DOS must perform for these calls. Examine 
the Load and Execute Program system call as an example. The first 
thing that MS-DOS does is look for the executable file and identify 
what type of executable file it is: .comor .exe. It does this by examin- 
ing the first two bytes. If it sees the correct . exe signature, it proceeds 
to load the file and do the correct relocation by correcting all segment 
references contained within the file. Otherwise, it assumes that it’s a 
.com file and just loads the file directly into memory. It then creates the 
PSP and populates it with all inherited file handles, updates the termi- 
nation address, Ctrl-C handler, and critical error handler. Next, the 
Starting address is determined, a program stack is allocated, and 
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machine registers are loaded. At this point, control is transferred to the 
loaded program. Much of the same effort goes into the Load system 
call. The only difference between the two is that transfer of control does 
not take place for Load. Another method is Load Overlay, but this only 
loads a program without changing the PSP or special handlers. 

When program execution completes, two system calls are available: 
Terminate Program and Keep Program. There is considerable differ- 
ence between these two calls. Both system calls store a program retum 
value available to the parent program, but they soon differ in function. 
The Terminate system call proceeds to undo everything done by the 
Load system calls. It must begin by replacing the termination address, 
Ctrl-C handler, and critical error handlers with those of the parent and 
go through the PSP and operate on the child’s file handles. After all this 
is complete, it returns the program’s memory back to the memory pool, 
effectively destroying the program image. Control is then transferred 
back to the parent program and execution continues within the parent. 
In contrast, the Keep Program system call must omit destroying the 
image, closing files, etc. The reason for this is simple: Whereas other 
operating systems have the concept of daemons built into them — for 
example, UNIX uses daemons for all of its networking, cron program 
scheduling, line printer functions, and other tasks — the only way to 
accomplish a similar task in MS-DOS is to terminate the program and 
keep its image in memory. The technique is known as TSR or Termi- 
nate and Stay Resident. It is for this reason that Keep Program must 
keep the image and associated data structures, such as the PSP, intact, 
since the program may be triggered by an external event. With this in 
mind, the differences become clear. Keep Program only stores the 
return code and returns control to the parent program. This guarantees 
that the program image remains intact. 
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MS-DOS Device Drivers 


The MS-DOS device drivers are simple in design but difficult to imple- 
ment in C, not because the code is complex, but because the data struc- 
ture used to communicate with the device driver is actually designed 
for assembly language and results in a very complex union to imple- 
ment. The major contributing factor is that the structure implements a 
common beginning for control information, but the following data 
areas vary for each call. I will examine the mechanism used to control 
driver entry points and a portion of the common data structure. 

The design of MS-DOS device drivers is very much tied to the x86 
architecture. If you examine Figure 2.12, you will see two sets of point- 
ers that are different in size. The first pointer, dhl ink, links together all 
the device drivers used by MS-DOS, both internal and loaded through 
config.sys. The other two pointers, dhStrategy and dhInterrupt, 
are Offsets into the two driver entry points. For these calls, the device 
driver segment is assumed to be the segment for the call, restricting the 
device driver size to one segment. 

The two names of the entry points, dhStrategy and dhInterrupt, 
suggest device drivers similar to those used in multitasking operating 
systems. Unfortunately, when you examine the way the typical MS-DOS 
device driver is written, those names are very misleading. In other oper- 
ating systems, the dhStrategy entry would queue up a request and 


Figure 2.12 The deviceheader data structure. 
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possibly start the transfer, and dhInterrupt would be the interrupt 
handler that might terminate the data transfer. MS-DOS designers 
never did get to complete this. In MS-DOS, dhStrategy 1s called first. 
When it is called, the device driver usually just stores the request 
pointer. The driver entry, dhInterrupt, is then called immediately 
afterwards where the actual driver call is implemented. 

The requestheader is the common data structure that prefixes all 
device driver requests (Figure 2.13). When the device driver is called, a 
far pointer to the request is passed to the device driver. Appended to 
the request area is a variable data structure that is unique for each sys- 
tem call. Because of this variability, rnLength indicates the length, in 
bytes, of the combined requestheader and unique data structure. To 
allow the device driver to handle more than one device, rhUnit speci- 
fies the device number that the driver is to operate on. The entry 
rhFunction specifies the function, such as read or write, requested of 
the driver. Finally, when the operation is complete, rhStatus returns 
the completion status. The entry rhReserved ts unused at this time and 
is reserved for future use. 

Of the remaining deviceheader entnes (Figure 2.12), the dhAttributes 
and dhNameOrUnits entries are used for informational purposes. 
dhAttributes is used to convey information to MS-DOS about the device 
driver. It is a bit field that indicates whether the device driver is for a block or 
character device, whether it is stdin or stdout, etc. The final field, 
dhNameOrUni ts, contains either the name of the device if it is a character 
device or the number of units handled by this device for a block device. 


Figure 2.13 The requestheader data structure. 
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Unfortunately, you need to leam many more details before attempt- 
ing to write a device driver. This task is beyond the scope of this book, 
but the topic is well documented by Microsoft and other publishers. I 
highly recommend that you refer to these sources before writing your 
own device drivers. I will go into the details of each device driver later 
in this book and cover what calls are handled and how. 


What I Am Not Covering 


As you can see by now, even a small desktop operating system like 
MS-DOS can be complex. You have seen that the API has different 
variations, a file has two parts hidden by the operating system, etc. 
However, there are parts of MS-DOS that I am not covering because I 
am not implementing them in DOS-C. I am not covering installable file 
systems because of DOS-C design limitations. Also, I will be taking 
liberty with internal data structures and error handling. With this infor- 
mation in hand, I will start the DOS-C design and implementation. 


Chapter 3 


Bird’s-Eye View 
of DOS-C 


Previous chapters examined the architecture of MS-DOS and looked at some 
details of the operating system. They also created a model of MS-DOS and 
defined the minimum set of programs that form MS-DOS. This chapter will 
take a look at the overall architecture and the major components of DOS-C 
and will show its relationship to the MS-DOS model. This chapter will also 
examine the dynamics of booting DOS-C and explain the design behind the 
techniques used. 

One major difference that you will see throughout this chapter 
is that the DOS-C design does not worry about smaller systems. This is 
really a reflection of the times. DOS-C expects to work in systems with 
at least 256Kb of memory, whereas MS-DOS expected only 64Kb 
(although the current kermel is much larger than the minimum memory 
requirements of MS-DOS v1.0). Also, you no longer need to worry 
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about saving disk space for small floppy disks because you can expect 
systems that have over seven times more storage than the original PC 
disks. Finally, I want to use a high-level language in the implementation 
so I won’t waste time trying various assembly language techniques to 
minimize size or maximize speed. As you will see in the final keel, the 
penalty you pay for the convenience of rapid coding in a higher level 
language is far less than some programmers would like you to believe. 
So with this in mind, start examining the architecture of DOS-C. 


Basic DOS-C Architecture 


Chapter 2 outlined an experiment to determine the minimum set of pro- 
grams necessary to run MS-DOS to limit the scope of the project, since 
individual programmers have to learn by example without the resources 
of a megacorporation with 200 programmers available to tackle the job. 
The result: MS-DOS needs, as a minimum, boot, 10. Sys, msdos.sys, 
and command .com to operate. These four components are the necessary 
set for MS-DOS, but this chapter will re-examine each component’s 
functionality and broaden the definition to improve upon the design. 

The boot sector is the first sector placed onto the diskette. You 
already know that all PC-style computers load and execute the boot 
sector in order to boot any operating system. As a result, you don’t have 
much latitude in the design for boot. Also, with less than 512 bytes to 
use for the boot code, you won't be performing many miracles. We do 
have a fair amount of freedom in the next stage. 

As you may recall, boot loads i0.sys, which proceeds to initialize its 
internal device drivers. i0.SyS continues to initialize its device drivers 
by loading ms dos .Sys and instructing it to initialize itself. io.sys then 
uses msdos.sys to read config.sys, if present. When io.sys com- 
pletes the processing of config.sys, it loads command.com and finally 
transfers control to it, allowing autoexec.bat to execute and ending 
up at the command line prompt. This is a good description of what 
MS-DOS uses, but you need to broaden the definition somewhat so 
that you can get a better picture of exactly what is happening. 
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If you step back and re-examine the operations, you can make some 
generalizations that will help define what you will do. You saw that 
boot loads a program that loads the full kernel. This loaded program 1s 
aware of a limited set of file system functions and knows how to read 
executable files into memory. If you take this generalization and look 
back into some computer science texts, you will find this to be the 
description of an Intermediate Program Loader or IPL. You will use this 
description of stage two of the booting process. You can now state that 
for DOS-C, the second stage is an IPL whose function is to load the 
final kernel. 

You also see from the previous description that when stage two is 
complete, the complete kernel has been loaded into memory and it has 
initialized its internal data structures. Again, if you generalize this 
description, you can say that the IPL loads the kernel and transfers con- 
trol to it. This is now stage three of the boot process. 

The last part of the previous example then shows that command .com 
is loaded and control is transferred to it. Once again you will step back 
and generalize this stage. What you can say is that the kernel, after ini- 
tialization, loads a user-level program that eventually brings you to the 
user interface. 

These four stages of booting also give nice logical breaks for defin- 
ing the four fundamental components. As stated earlier, the first compo- 
nent is boot, since its behavior is largely dictated by the computer 
architecture and not the operating system itself. The second component 
is an IPL that 1s aware of the file system and the structure of the kernel 
and loads the kernel into memory. The third component is the kernel, 
responsible for all the system calls and resource management of the 
operating system. The final component is the command line interpreter. 

Now take these four components and make some decisions about the 
nature of each. You know that the boot must be in the first sector of the 
disk and that it must contain certain data structures. It is a binary image 
and will utilize the BIOS for communications. It will look for the IPL 
and load it into memory. 

The IPL will be aware of the file system and be capable of loading 
an executable image into memory. If you look at this simple descrip- 
tion, you see that you can say that the IPL is a specialized version of the 
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operating system. This is an excellent way to design it, since you can 
concentrate your efforts on the kernel itself and use the C compiler’s 
preprocessor to conditionally exclude unnecessary portions of the ker- 
nel. Because you know that boot is limited in capability, you will make 
the IPL a binary image easily loaded into memory and call it ip] .sys. 
You can also take advantage of its derivation from the kemel and give it 
the capability to load . exe files, yielding tremendous flexibility in the 
requirements for the kernel. You can load one or more files into mem- 
ory, relocate them dynamically during the load process, and save devel- 
opment time since you will need to copy only one or more .exe files to 
the boot diskette in order to change the kernel. 

For DOS-C, you will limit the kernel to a single file, kernel .exe, 
so that you will not need to get fancy with the development process. It 
will be a single executable, with full relocation capability, and you can 
generate it using standard DOS compilers and linkers. It will initialize 
itself and load the user-level programs to start the user interface. 

Because you are modeling MS-DOS, the final component will bor- 
row from MS-DOS and will be command.com. You do not necessanly 
need to do this, as you could have loaded an . exe version of command or 
an init similar to UNIX that brings the operating system to a multi-user 
state. However, so many programs have been written with the concept 
that the command line interpreter is loaded next and that it is called 
command .com that changing this step may only complicate things. So 
you will follow closely MS-DOS and end with command .com executing 
autoexec.bat to give the user interface. 


Booting DOS-C 


You have generalized the MS-DOS boot stages and the basic partition- 
ing of the necessary operating system functions into DOS-C files. 
Examine the dynamics of booting DOS-C. Figure 3.1 shows the boot 
stages for DOS-C. Four distinct stages are involved in the boot, corre- 
sponding to the four minimum components necessary for DOS-C. This 
approach simplifies the design through modularization. By confining 
each stage to its own unique module, interaction is confined to the 
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interface layers. A bug that may exist in kernel .exe does not require 
changes in ip].sys nor will any change in size of either ip] .sys or 
kernel .exe affect each other. 

The computer’s ROM BIOS initiates the first stage. This stage is 
identical to the way almost any PC-compatible operating system starts 
out. The BIOS fixes the location of boot in memory and the initial 
cs:ip. For MS-DOS, IBM placed the location of boot at 0000: 7c00h 


Figure 3.1 The four stages of DOS-C boot. 
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to allow 30Kb for loading io.sys and msdos.sys. DOS-C does not 
need boot to load in this location because all of the following stages are 
either position independent or relocatable, and you can judiciously jug- 
gle the following stages for any size kernel. 

As a neutral choice, load the next stage, ip] .sys, into memory at 
1000:0000h. There is no magic to this address: it can be almost any- 
where. You can dynamically locate ip].sysS anywhere in memory 
because you have built it from the kernel source code using the x86 
small model. This limits the total size of ip].sys to only 64Kb. By 
limiting the scope of the IPL, you can easily fit its functionality into 
this memory size. 

ipl.sys loads kernel .exe and is aware of MS-DOS file systems. 
However, its lifetime is short and its purpose is simple: load the kermel. 
In order to reduce development time, it is logical to derive the IPL from 
the kernel source by reducing the kernel functionality to only what is 
needed to load the kernel. You will see how I have derived ip! .sys 
from the kernel source code when you examine the source code, but for 
now I will outline what you don’t need. For one thing, you won’t need 
any sort of API since you won’t have any other program using the inter- 
nal functions. You can also reduce the size of ip] .sys by eliminating 
functions you don’t need. For example, because it is strictly a loader 
there is no need for functions that write or delete. You only need func- 
tions that open, read, and close. You can further simplify ip] .sys by 
restricting it to only one loader. You can always switch to the other or 
include both if you decide to do so in the future. Realistically, if you 
choose the .exe loader you are not restricted to a programming model, 
and therefore this is the choice for the design. Also, you don’t need all 
of the device drivers. It does not make sense to have a line printer or 
serial port driver if the primary function is booting. 

You also should recognize the potential behind an IPL boot process. 
You can design a debugger or menu system into the IPL to choose the 
program you intend to load. Either feature enhances the debugging pro- 
cess, since you can stop after loading the kernel and start debugging, or 
you can decide to change the name of the experimental kernel and 
select it at runtime. 


Bird’s-Eye View of DOS-C — 61 


Stage three starts once the kernel is loaded into memory. The kermel 
Starts at main( ), just as any C program will, and begins by initializing 
itself. This includes locating all of memory, building arenas, and load- 
ing the proper addresses into the interrupt vectors. After initialization, 
processing config.sys can occur, but you won’t do that for this ver- 
sion. Finally, the kernel searches for and loads command. com into mem- 
ory and makes this the granddaddy of all programs. If command.com 
returns, the kernel will shut down and you will need to reboot. This is 
no different than the case where MS-DOS cannot find command.com 
and issues the error message “Bad or missing command interpreter,” 
except that the approach is cleaner. 

When command.com is successfully loaded, stage four commences. 
command.com does its own initialization including creating the initial 
environment and executing autoexec.bat. Once this has occurred, 
command .com issues a prompt and goes into a loop awaiting user input. 
DOS-C is now loaded and ready to run. 

I feel that it is always a good idea to include visual indicators in the 
code so that you can monitor the progress of an operation. When you 
boot DOS-C, the first indicator you will see is from boot. Because boot 
is small, it doesn’t do very much except announce itself by placing the 
message “boot” on the screen. Although it doesn’t seem like much, 
many times when I didn’t see this little message, I immediately knew 
that I had an error to find. 

ipl.sys announces that it is booting DOS-C, and the modified .exe 
loader outputs “.” at regular intervals during the loading process. 
ip].sys is bigger than boot and can afford to be somewhat more 
verbose. The reason for two visual indicators is twofold. First, you 
need an indicator that ip] .sys has been successfully loaded, the same 
as boot. Second, other things can go wrong with ip].sys, so a visual 
indicator of the load process helps identify errors and the choice of 
putting a “’.” to the screen at strategic locations is very helpful. Finally, 
kernel .exe and command .com both announce themselves when loaded 
but do not contain progress indicators because the debugging technique 
used on them is different and gives better information than a visual 
indicator does. 
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Now that you have conceptually loaded DOS-C into memory suc- 
cessfully and have it operational, you can look a little closer at its orga- 
nization. Next examine the design of the DOS-C kemel, and see how 
you can organize the code into logical modules, each implementing a 
logical system function. 


DOS-C Kernel Architecture 


The DOS-C architecture takes a layered approach. This approach is not 
very different from other operating systems. The only deviation you 
will find is that there is a decoupling between the upper and lower lay- 
ers of the operating system. 

Figure 3.2 shows the architecture of DOS-C. Because of the need for 
compatibility with the de facto DOS standards, full compliance is nec- 
essary in two interface areas. The first interface area is obvious — the 
DOS API entry points. These are the entry points that you expect from 
a DOS-compatible operating system. These include the traditional 
int 20h, 21h, 22h, 23h, 24h, 27h, 28h, and 2fh. 

Another less obvious interface is the device driver interface. This 
interface is also well documented and supported by many hardware and 
software vendors. It is important that DOS-C is compatible with this 
interface so that you can assure proper support of loadable device drivers. 

The kernel consists of C code wherever possible. However, certain 
functions are written in assembly language. These functions typically 
perform direct machine control, such as manipulating the stack and 
flipping interrupt bits. You will find this code in a number of assembly 
files along with C-to-assembly interface functions. 

Following a system call through the kemel is an easy way to take a 
tour through the kernel. Taking this journey and noting where certain 
features exist will help when you examine them more closely. 
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DOS-C API 


A DOS API entry point handler intercepts any system call to DOS-C 
and translates it into a C call. Do this as close to the entry point as pos- 
sible in order to use C as early as possible. The C call then does some 
context switching and performs a call to the appropriate internal service 
function. It is up to the service routine to route the system call through 
the kernel to the appropriate handler. 


Figure 3.2 DOS-C kernel architecture. 
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The path through the kernel varies depending on which DOS API 
entry point you examine. In order to continue the search deeper into the 
kernel, focus on the int 21h service function, int21_service. This 
handler is a dispatcher that examines the call number found in the regis- 
ter storage area. It uses this number to execute some local machine- 
dependent code that converts register variables to C parameters for 
function calls, calls an internal function that performs the call, and con- 
verts the retum values back into register values before returning. The 
dispatcher returns to the handler for another context switch and the 
entry point handler returns to the user. 

As mentioned earlier, the dispatcher performs the appropriate C 
function call to perform the DOS-C operation. DOS-C directs this call 
into the kernel and can only proceed “down” in the architecture. One 
architecture design rule is that functions are only aware of functions at 
the same or lower layer. This is a DOS-C design choice that allows the 
easy addition of functionality at the higher levels without affecting the 
operational layers. 

The DOS API layer directs a larger number of calls into the dosfns, 
or DOS Functions, layer. This layer works with the lower layers to add 
the DOS appearance to the system calls. It does this by combining calls 
into the various managers in the next lower layers with code to add fea- 
tures to the manager calls that are unique to a DOS interface. For exam- 
ple, this layer contains functions that encompass all rules for handling file 
I/O that deals with PSP, handles, or character and block I/O functionality. 


DOS-C File System 


The DOS-C design allows for a file manager that is independent from 
the design of the rest of the operating system. The fs module is the 
DOS-C file manager. Whenever an application requests an operation on 
a file that is resident on a block device, fs performs the operation. It 
does this by working with an internal table, called the f_node (or file 
node) table. 
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The f_node table is the foundation for the design of the FAT file 
system manager. The f_node data structure controls all internal file 
operations and virtualizes the file. When fs performs an operation such 
as create or open, it allocates an entry from the f_node table and popu- 
lates the f_node fields. The call returns a number that is the index into 
the f_node table. From here on out, any function that performs an oper- 
ation on the file receives this number. The f_node table entry that cor- 
responds to this index controls all file parameters. It is the anchor that 
ties together the three components of a file discussed in Chapter 2. 

If you are familiar with DOS internals from the undocumented inter- 
nals literature, you will recognize the need to map an internal f_node 
to the SFT (System File Table) structure used by DOS. The need for 
this mapping is a result of the many programs that expect an SFT entry 
for a file. This is one of the many unfortunate results of having an oper- 
ating system that is open to probing from user programs — you lose 
control of internal data structures. Changing any internal data structure 
may break a working application. It also complicates DOS-C design 
because these considerations will occur repeatedly and must be 
addressed in a manner that guarantees DOS compatibility. 

For file-related issues, you handle this through the dosfns layer 
function. The module dosfns handles this type of mapping, which is 
part of its “DOS personality” responsibility. At the fs level, it does its 
work in an “OS-neutral” mode in order to localize unique features to 
the personality module. This design lends itself to portability but will 
complicate some DOS personality issues. Fortunately, the benefits far 
outweigh the complications. 


DOS-C Memory Management 


Memory management in DOS-C is an area that requires close emula- 
tion of the MS-DOS techniques. Because DOS-C is a real-mode operat- 
ing system, the memory management techniques presented here are 
much simpler than those you will find if you examine more advanced 
operating systems. 
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In a manner similar to the MS-DOS memory management scheme, 
DOS-C manages memory through the use of a memory block linked 
list that uses the same arena header as the one described in Chapter 2. 

The arena header in DOS-C is known as the MCB (Memory Control 
Block). As with MS-DOS, this data structure is in front of each allo- 
cated and free memory block. DOS-C uses the same “M” and “Z” sig- 
natures to identify the block type. Also, the segment value of the PSP of 
the program it belongs to, or a 0008h if it belongs to DOS-C, identifies 
the owner of the block in the same way. DOS-C uses all other entries in 
an identical manner. Again, many programs make use of this informa- 
tion, and DOS-C provides this same information for the sake of com- 
patibility. Unlike fs where DOS personality splits between it and 
dosfns, memmor contains all DOS personality. 

DOS-C provides the same API calls as MS-DOS to manage the are- 
nas. It provides an allocation function and a dispose function along 
with a call used to set allocation strategy. Additionally, a number of 
calls are used internally within DOS-C for initialization of the arena 
and validation of the integrity of the arena whenever memory blocks 
retum to the free memory pool. 


DOS-C Task Management 


Because the DOS-C design adheres to the de facto DOS standard, it 
should not be surprising that the operating system is not multitasking. 
This is not entirely true as you will see in later chapters. There are 
really three tasks in DOS-C, the user task, the kemel task, and the 
driver task, but the user only sees a single task. The task manager, task, 
manages this single user task. 

The primary function of the task manager Is to act as a task loader, 
unlike other operating systems where task loading is incidental to man- 
aging the task. Also, there are only two relatively simple executable file 
types, .comand .exe. We have seen descriptions of both in Chapter 2, 
but we will examine the dynamics of loading the two types of tasks. 
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Both executable file types start from a single load entry. It is up to 
DOS-C to identify which file type it is. It does this by examining the 
first two bytes of the file. If it is an “MZ”, then it is an .exe file and 
task invokes the .exe loader, otherwise task assumes it to be a .com 
binary image and invokes the . com loader. 

Both loaders initially follow similar code. They both allocate mem- 
ory from memmgr to place the environment strings. The differences 
begin when the actual file loading occurs. The .com loader merely allo- 
cates memory and begins loading the file into memory for a maximum 
of 64Kb. The .exe loader, on the other hand, computes the size 
required and then allocates memory. It then proceeds to load the image. 
When task completes loading the image, it does a seek to the reloca- 
tion offset and does a segment fixup (necessary for the segmented 
architecture of the 80x86 family). 

From this point, both loaders proceed to create the PSP and clone 
the file table. They both end by allocating and initializing the tasks’ 
registers and executing the task if so requested. The only minor differ- 
ence here is that the initial ss:sp and cs:ip are default values for a 
.com, whereas task computes them from the .exe file header for the 
exe file. 

The user program termination handler is the final task code to be 
examined. In our task module, program termination is handled in a 
straightforward manner by closing all open files, restoring the parent’s 
information, and, if it is not a TSR, releasing the program’s memory. 


DOS-C High-Level I/O and Device Drivers 


So far, you have followed the system call through the DOS-C API and 
examined the various managers that provide the requested service. If 
the request is an I/O request, the system call now proceeds deeper into 
the kernel to perform raw I/O. The manager does this by placing a call 
into the next layer — the raw I/O handlers. 
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There are two handlers in the raw I/O layer, chario and blockio. 
Although each handler uses different methods, both handlers are the 
primary interface between the device drivers and the remainder of the 
kernel. Both handlers perform all the necessary buffer management for 
both types of I/O, including line buffer management. 

The I/O handlers are loosely based on the UNIX I/O model. As with 
UNIX, two types of I/O are defined. The first type is the character /O 
type. This type of I/O appears as a stream of bytes to the kemel. The 
kernel either sequentially reads a byte into the kemel or writes from the 
kernel. There are also functions to read a buffer into memory and han- 
dle the familiar DOS line editing functions. 

The block I/O interface provides functions to read and write a block 
of data to a block device — usually a disk. Each block corresponds to a 
sector and is actually a data structure in a block cache. When the kernel 
reads data from a disk, blockio reads the sector into a buffer and 
places it into a Least Recently Used (LRU) chain. When blockio 
needs a new buffer, it writes the tail of the list to disk, if needed, and 
returns that buffer. The buffer then becomes available for data transfer 
into the buffer. When b!ockio completes the data transfer, the buffer 
goes to the head of the LRU chain, indicating that it is the newest 
buffer. Management functions handle these operations for dirty buffer 
write back, as well as buffer fill. Buffer management functions also per- 
form operations such as LRU management and flush. 

As with the API, the device driver interface is well documented. It is 
important that DOS-C closely follow this convention. Because the 
interface is designed for an assembly language system, a special 
assembly language function interfaces all device drivers. This function, 
execrh( ), accepts a request packet from the I/O handlers that contains 
a function number requested by the device driver. It handles the correct 
sequence of calls for the device driver to the strategy and interrupt entry 
points and returns the packet to the I/O handler. The function execrh() 
is a C call, allowing an easy assembly language interface. 
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The DOS-C device drivers are the bottom layer of the kemel. Like 
the remainder of the kernel, C is used in these device drivers also. The 
device drivers perform the necessary device interface between the kernel 
and the device itself. DOS-C has the same device drivers as MS-DOS 
and they perform the same functions as their MS-DOS counterparts. 


Final Notes 


The discussion of the DOS-C architecture covers about 90% of the ker- 
nel. MS-DOS defines a few functions that don’t neatly fit the architec- 
ture. These functions, such as time management, National Language 
Support, etc., are handled individually in the layer that best supports 
them. Unfortunately, the design trade-offs forced special handling. As 
you will see in later chapters, the benefits of the layered design far out- 
weigh the inconvenience of special functions. 


Chapter 4 


DOS-C Kernel: 
File System Manager 


Up to now, I have covered generic DOS and DOS-C basics. I have 
looked at the architecture of DOS-C and discussed its layered 
approach, design rules, and calling conventions. The method I used was 
a hypothetical trip through the kermel, and I examined the flow of a 
generic system call through the kernel and the interface layers. 

This chapter examines the code in the kemel from the device inter- 
face layer up and starts by examining the device driver interface. This 
interface is an important piece of DOS-C, since all device driver calls 
use this interface. I will then look at the lower half of the file system 
managers, namely the block device and the character device interfaces. 
I will examine the data structures used in both and detail the design 
methodology behind each. Finally, I will tie these together by present- 
ing the FAT file system code. 
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Device Driver Interface 


The design of the code that performs the device driver interface is one of 
the critical areas in the kernel. In order to use standard DOS device driv- 
ers, the kernel must call the device driver in a manner similar to other 
DOS-compatible operating systems. This is necessary to guarantee com- 
patibility with third-party device drivers (i.e., your mouse driver, some 


Figure 4.1 Device driver functionality layer. 
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SCSI disk drivers, etc.). It must also be easy to use by the kermel’s C 
code modules. I achieve ease of use through a simple C function that 
receives a pointer to a standard DOS request packet and another pointer 
to a data structure, known as the dhdr, used to manage devices. This 
forms the device driver [execrh( )] interface layer (Figure 4.1). 

The realization of this design is an assembly code module that 
translates a C call into the appropriate device driver call. The file 
execrh.asm contains the actual assembly code; however, look at the 
pseudo code in Figure 4.2 in order to gain a better understanding of the 
process. The call takes two parameters, a far pointer, rhp, to the 
request packet data structure and another far pointer, dhp, to the dhdr 
data structure. 


Figure 4.2 Psuedo code for execrh() interface function. 


void 
execrh(request far *rhp, struct dhdr far *dhp) 
{ 

void (*TempFunc)(); 


/* perform c entry */ 

*--sp = bp: /* Actually a push instruction*/ 
bp = sp; 

sp -= 4; /* and make space for TempFunc*/ 


/* save old contents of temporary registers*/ 
*--sp = si; /* Actually a push instruction*/ 
*--sp = ds; /* ditto x], 


/* 
* Set up for standard DOS device driver call by assigning 
* passed in far pointers into their respective registers 
*/ 
ds:si = dhp;/* ds:es points to the device header */ 
es:bx = rhp;/* es:bx points to the request packet */ 
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The first parameter, rhp, is a pointer to the request packet data struc- 
ture as described in various DOS references. The request packet is a 
variable data structure used by DOS to command a device to perform 
control, input, or output functions. The information contained within 
the request packet is all that is required by the device driver to perform 
the requested function. This request packet is the standard request 
packet expected by DOS-compatible device drivers. 





Figure 4.2 Psuedo code for execrh() interface function — 
continued. 


~ 
* 


After translating C parameters to DOS standard 
device driver parameters, begin executing driver 
strategy and interrupt code by indirect calls 
found in the device driver header. Method used is 
assigning the function pointer to a temporary 
variable and using it for the function call. 


NOTE: device drivers use “tiny” model, so 
register adjustments are made along the way. 
/ 


+ %  % % He OF 


/* Call strategy first */ 
TempFunc = dhp -> dhStrategy; 
TempFunc(); 

/* Call Interrupt next */ 
TempFunc = dhp -> dhInterrupt; 
TempFunc( ); 


/* and exit perform ac exit */ 


ds = *spt+; /* Actually a pop instruction*/ 
si = *spt+ /* ditto */ 
sp = bp: 
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In order to properly identify the device that performs the I/O request, 
execrh( ) requires a second parameter, dhp, the device header, which is 
loosely based on a similar data structure discussed in popular “undocu- 
mented” literature. It contains a pointer to the device driver variable and 
other useful information such as current directory for block devices. 

Like other DOS-compatible operating systems, DOS-C creates a 
device I/O request in kernel space. Unlike some DOS-compatible sys- 
tems, DOS-C creates this space as an automatic variable in the function 
that is performing /O. DOS-C fills the packet then proceeds to call 
execrh( ) to execute the call. 

Upon entry into the function, execrh() performs a sequence of 
instructions that emulate a normal C function entry, including reserving 
space for the local variable TempFunc. It next prepares for a standard 
DOS device driver call by assigning f ar pointers passed into execrh( ) 
to their respective registers — ds:es for the device header and es: bx 
for the request packet. 

After translating the C parameters to DOS standard device driver 
parameters, execrh() continues by executing driver strategy and inter- 
rupt code with indirect calls found in the header of the device driver. 
The method used to perform the call is to load the function pointer into 
a temporary variable and use it for an indirect subroutine call. Note: 
Device drivers use the “tiny” model, so the actual assembly code makes 
register adjustments along the way. 


Character Device Interface 


The character device interface is one of the two lower-half functions 
used throughout the operating system (as diagrammed in Figure 4.3a, b). 
This interface handles both single-character I/O and buffered I/O, 
including command-line canonical processing. The file chario.c con- 
tains all functions related to this interface. With the understanding of 
how execrh( ) works, examine a simple function that uses it. 
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Let’s examine the low-level character output function _sto() (List- 
ing 4.1). This function is responsible for all output to the system con- 
sole. The first thing it does is look for a break (Ctrl-C or Ctrl-Break). I 
will discuss the exact mechanism later, but for now just assume the 
existence of a function that returns TRUE if a user has hit a Break key. If 
a break condition exists, _sto() handles it and exits immediately; if not, 
it proceeds to read the character. As mentioned before, the request 
packet is actually an automatic vanable allocated on the kernel stack; 
_sto() is a perfect example. The function fills the fields of the request, 
including the command requested. Each I/O function has different 
fields, but the concept is universal to all the low-level drivers. 


Figure 4.3a Character device functionality layer. 
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Figure 4.3b Low-level character output functions. 


| sti 
sti 


Listing 4.1 Source code for _sto() function. 


static BOOL 
_Sto(COUNT c) 
{ 
request rq; 
BYTE buf = c; 


if (con_break()) 
1 
handle_break(); 
return FALSE; 
i 
rq.r_length = sizeof(request); 
rq.r_command = C_OUTPUT; 
rq.r_count = 1; 
rq.r_trans = (BYTE FAR *)(&buf); 
rq.r_status = 0; 
execrh((request FAR *)&rq, (struct dhdr FAR *)&con_dev); 
if(rq.r_status & S_ERROR) 
return char_error(&rq, con_name) ; 
return TRUE; 
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After __sto() initializes the request packet, it invokes the execrh( ) 
function. As you saw earlier, this is where the device driver is called. 
When execrh() returns, _sto() examines the request packet status. If 
an error occurred, the kernel character error handler is invoked; other- 
wise, the function successfully returns. Again, this handling is typical 
for all the low-level device driver interface functions. 

A function similar to _sto() is the sto() function (Listing 4.2). 
The function _sto() provides only primitive functionality. Users of 
modem operating systems, have become accustomed to seeing tab 
expansion, screen hold (Ctrl-S and Ctrl-Q handling), etc. The sto() 
function provides this functionality. 


Listing 4.2 Source code for sto() function. 


VOID 

sto(COUNT c) 

( 
/* Test for hold char */ 
con_hold(); 


/* Display a printable character*/ 
if(c != HT) 
_Sto(c); 
if(c = CR) 
scr_pos = 0; 
else if(c = BS) 
{ 
if(scr_pos > 0) 
--SCr_pos; 
} 
else if(c = HT) 
{ 
do 
_sto(' '); 
while(++scr_pos & 7); 
} 
else if(c != LF && c != BS) 
++Scr_pos; 
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The first order of business for sto( ) is to check for the reception of a 
hold character (Ctrl-S). This is done through the special function 
con_hold( ), which pauses until a Ctrl-Q is received. As with other parts 
of DOS-C, sto() also builds on lower level functions, in this case 
_Sto(), to achieve its functionality. sto() tests for special characters 
such as a horizontal tab (HT), enter, or carriage return (CR). Tab charac- 
ters are expanded to multiples of eight spaces, whereas carriage return 
and backspace characters are intercepted by sto(), which modifies the 
global variable scr_pos, which tracks the current column position of the 
cursor. SCr_poOs is also used later in the command line editing functions. 

Similar to __sto() is_sti() (Listing 4.3). Like __sto(),_sti() isa 
lower level function that performs input. Two levels of character I/O are 
needed because DOS-compatible operating systems can supply both raw 
(no tab expansions, backspace characters that do not delete previous 
characters, etc.) and processed or “cooked” (tab expansions, backspace 


Listing 4.3 Source code for __sti() function. 


COUNT 
_sti(VOID) 
( 
BYTE cb; 
request rq; 


rq.r_length = sizeof(request); 
rq.r_command = C_INPUT; 
rq.r_count = 1; 
rq.r_trans = (BYTE FAR *)&cb; 
rq.r_status = 0; 
execrh((request FAR *)&rq, (struct dhdr FAR *)&con_dev); 
if(rq.r_status & S_ERROR) 
return char_error(&rq, con_name); 
if(cb = CTL_C) 
{ 
handle_break(); 
return CTLC; 
} 
else 
return cb; 
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characters that delete previous characters, etc.) /O. Because both input 
and output require the two levels of support, a similar design is used in 
both _sto() and _sti(). 

The function sti() supplies the next layer of functionality (Listing 4.4). 
Here, the symmetry breaks down because the DOS compatibility rule dic- 
tates that the next level fills a special keyboard buffer and performs the 
expected line editing. As a result, st i( ) is much longer that s to( ). 

Sti() begins by checking the line buffer that is passed in to see if it 
is large enough to hold the requested number of bytes and for a carriage 
retum terminating the buffer if it is not empty. The higher layer read 
functions use this particular feature. (The preprocessor switch NOSPCL 
is discussed later). 

Sti() then falls into a loop that terminates only when _sti() 
retums a carriage retum. At the beginning of the loop, sti() then tests 
a character with the use of a switch. The switch defaults to a simple 
entry of the character into the keyboard buffer with the use of a helper 
function, kbfil1l(). The switch captures other characters, such as 
backspace and escape, and diverts them to special handling code. It is 
this mechanism that produces the familiar repeated line response to the 
F3 function key and the erased character with both the backspace key 
and the back-arrow key. 

One interesting note here is the use of a static character array as a 
buffer for the last line. As with all DOS operating systems, DOS-C pro- 
vides the ability to recall the last line entered a character at a time with 
the use of the right-arrow key. When DOS-C receives a carriage return, 
it transfers the line into the static buffer 1ocal_buffer in order to 
properly perform line editing. DOS-C replaces the received character 
by the character contained in ]ocal_buffer at the correct position 
within the line as recorded by structure member, kp -> kb_count. 
This buffer is also used to repeat the entire last line whenever the F3 
function key is received. I must warn you that this particular design is 
not meant for multiprogramming environments because the use of a 
static variable defeats the reentrant requirement that all functions must 
follow in a true multiprogramming kernel. However, DOS-C is not 
multiprogramming, so you can allow some liberty and use simpler code 
in the function. 
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Listing 4.4 Source code for sti() function. 


VOID 
sti(keyboard FAR *kp) 
{ 

REG UWORD c, Cu_pos = Scr_pos; 

WORD init_count = kp -> kb_count; 
#Hifndef NOSPCL 

static BYTE local_buffer(LINESIZE); 
fendi f 


if(kp -> kb_size = 0) 
return; 
if(kp -> kb_size <= kp -> kb_count || kp -> kb_buf(kp -> 
kb_ count] 
'= CR) 
kp -> kb_count = 0; 
FOREVER 
[ 
switch(c = _sti()) 
( 
case CTL_F: 
continue; 


iHifndef NOSPCL 
case SPCL: 
switch(c = _sti()) 
( 
case LEFT: 
goto backspace; 


case F3: 
( 
REG COUNT i; 


for(i = kp -> kb_count; local_buffer[i] != ‘\0'; 


[ 
c = local_buffer(kp -> kb_count ]; 
if(c == ‘Ar’ [| c = ‘An') 
break; 
kbfill(kp, c, FALSE); 
) 
break; 
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Listing 4.4 Source code for sti () function — continued. 


case RIGHT: 
c = local_bufferlkp -> kb_count]; 
if(c == ‘\r’ [| c = ‘An') 

break; 

kbfill(kp, c, FALSE); 
break; 

} 

break: 

#tendif 


case CTL_BS: 
case BS: 
backspace: 
if(kp -> kb_count > 0) 
{ 
if(kp -> kb_buf(kp -> kb_count - 1] D=* ' 
{ 
destr_bs(); 
| 
else if((kp -> kb_buf(kp -> kb_count -1] < ‘ ‘) 
&& (kp -> kb_buf{kp -> kb_count - 1]!= HT)) 
{ 


destr_bs(); 
destr_bs(); 
] 
else if(kp -> kb_buf{kp -> kb_count -1] = HT) 
( 
do 
{ 
destr_bs(); 
| 
while((scr_pos > cu_pos) && (scr_pos & 7)); 
} 
--kp -> kb_count; 
| 
break; 


case CR: 
kbfill(kp, CR, TRUE); 
kbfill(kp, LF, TRUE); 
#ifndef NOSPCL 
fbcopy((BYTE FAR *)kp -> kb_buf. (BYTE FAR *)local_buffer, 
(COUNT)kp -> kb_count); 


local_buffer(kp -> kb_count] = ‘\0’; 
#tendif 
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The file chario.c also contains a number of helper functions that 
functions in chario.c and other files use. These functions perform 
tasks, such as converting displayed control characters into a printable 
form by placing a ““” in front of the corresponding printable character 
[mod_sto( )]. Other helper functions fill the keyboard buffer and pre- 
vent buffer overrun conditions while warning the user with an audible 
alert [kbfi11()]. I suggest that you study these functions in order to get 
a better understanding of the interrelationships between the main I/O 
functions and the helper functions. 

One final note about sti(). The DOS-C design allows for non- 
IBM-compatible I/O, such as a serial interface, where these function 
keys (F3, right arrow, and left arrow) may not exist. In order to work in 
this environment, you must define NOSPCL during the compilation of 
chario.c. This removes the local_buffer functionality from the 
function sti(). 


Listing 4.4 Source code for st i() function — continued. 


return; 


case LF: 
sto(CR); 
sto(LF); 
break; 


case ESC: 
sto("\\'); 
sto(CR); 
sto(LF); 
for(c = 0; c < Cu_pos; ct) 
sto(' '); 
kp -> kb_count = init_count; 
break; 


default: 
kbfill(kp, c, FALSE); 
break; 
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Block Device Interface 


The block device interface is similar to the character device design. 
Like its character counterpart, the block device interface has a lower 
layer that acts as the device driver interface and an upper layer that per- 
forms higher level functions such as buffering (Figure 4.4a,b). How- 
ever, they both differ due to the nature of the I/O. 

At the API level, any file or device can have a read or write request 
of an arbitrary number of bytes. For the character device interface, this 


Figure 4.4a Block device functionality layer. 
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is usually not a problem because character devices are capable of trans- 
ferring a single byte at a time. Block I/O devices must transfer data a 
fixed number of bytes at a time, 512 bytes for a DOS-compatible oper- 
ating system. Also, block devices are random access, meaning that any 
block can be accessed in any order, with finite rotational latency and 
cylinder stepping times incurred with each read from a disk block. 
Therefore, our lower layer for the design of the block device interface 
buffers these blocks in a way that reduces disk access times. DOS-C 
uses a linked list of block buffer structures that contains flags, pointers, 
and a data buffer. A last-recently-used list updated with every buffer 
access organizes the buffer structures. DOS-C reads the buffers into 
memory only when it needs a buffer not in the cache and writes a buffer 
to disk when it needs a new buffer that is not in the cache and the buffer 
removed from the end of the list was modified. DOS-C also writes to 
disk when it specifically flushes the disk buffers. 

Similar to the character I/O functions, the block I/O functions also 
have a device driver interface that converts the assembly language call- 
ing conventions to C. For our block I/O functions, the call is 
dskxfer( ) (Listing 4.5). 


Figure 4.46 Upper-layer block 1/O functions and lower-layer 
device driver interface. 
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On first examination of dskxfer(), the system call does not seem 
very different from its character counterpart except that the set-up of 
the request packet is inside an infinite loop. I will address the reason for 
this shortly, but it will be an important item to note. 

When you look closer at the set-up in dskxfer(), you see that the 
block unit passes a subunit. The use of a subunit allows a device driver to 
handle multiple drives (i.e., the floppy driver needs to handle both A and 
B drives while the hard disk driver handles C and D drives). You also see 
that the command can be any one of three: C_LOUTVFY, C_OUTPUT, and 


Listing 4.5 Source code for dskxfer() function. 


BOOL 
dskxfer (COUNT dsk, LONG blkno, VOID FAR *buf, COUNT mode) 
{ 

REG struct dpb *dpbp = &blk_devices[dsk]; 

request rq; 


for(;;) 
{ 

rq.r_length = sizeof(request); 

rq.r_unit = dpbp -> dpb_subunit; 

rq.r_command = 

mode = DSKWRITE ? 

(verify_ena ? C_OUTVFY : C_OUTPUT) 
: C_INPUT; 

rq.r_status = 0; 

rq.r_meddesc = dpbp -> dpb_mdb; 

rq.r_trans = (BYTE FAR *)buf; 

rq.r_count = 1; 

if(blkno > MAXSHORT ) 

[ 
rq.r_start = HUGECOUNT; 
rq.r_huge = blkno - 1; 

) 

else 
rq.r_start = blkno - 1; 

execrh( (request FAR *)&rqg, dpbp -> dpb_device); 

if(!(rq.r_status & S_ERROR) && (rq.r_status & S_DONE)) 
break; 
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C_INPUT. As you can see, dskxfer() addresses both input and output. 
Additionally, there are two output modes, a simple write mode 
(C_OUTPUT) and a write then verify mode (C_OUTVFY). At one time, disk 
hardware and media were not as reliable as they are today, so it was fool- 
ish to operate without this mode. Today’s hardware is much more reli- 
able, so you don’t really need to operate in this mode. The C_OUTVFY 
mode is also time consuming, requiring that the write operation complete 
and the disk spin one full revolution to perform the verify operation. 

The next area in which you will notice a difference is where 
dskxfer() passes the block number, b] kno, into the request packet. You 
will notice that when the block number is greater than MAXSHORT, 
dskxfer() sets the r_start field to HUGECOUNT and the r_huge field to 
blkno - 1; otherwise, dskxfer() sets the r_start fieldto blkno - 1. 


Listing 4.5 Source code for dskxfer() function — continued. 


else 
( 
loop: 
switch(block_error(&rq, dpbp -> dpb_unit)) 
{ 
case ABORT: 
case FAIL: 
return FALSE; 


case RETRY: 
continue; 


case CONTINUE: 
break; 


default: 
goto loop: 
) 
} 
J 
return TRUE; 
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The two cases are handled similarly because both require that b] kno be 
decremented, although the internal representation of a block device starts 
with one, whereas the device driver starts with zero. Two data structure 
members are needed. Because DOS now allows more than 65,536 blocks 
to a device. When the requested block exceeds 65,535, a 32-bit field is 
used instead of the traditional 16-bit field. The need for 32-bit block 
addresses resulted from the availability of drives with greater than 32Mb 
(65,535*512 bytes). The use of a split handler allows for backward com- 
patibility with older disk drivers. 

After dskxfer() fills the request packet, it makes a device driver 
call through the same execrh( ) function that the character device uses. 
It is after this call that the function differs from its character counter- 
part. Where the corresponding character device handler simply exits 
through the character error handler, dskxfer() has a loop. The loop 
Starts out by invoking the block error handler, the function that outputs 
the familiar “Abort, retry, fail” error message. Depending on what the 
error handler returns (presumably, the user responded with one of the 
three choices), the loop is responsible for the resulting action. For a 
retry response, the loop forces the whole operation to be retried through 
a simple C continue statement. For a fail or abort response, the loop 
forces an error return. For a continue response, the loop allows DOS-C 
to ignore the error altogether: a C break statement allows the function 
to exit through the normal function return so that a success is returned 
regardless of what the driver reported. Note that the error handler must 
return a valid condition; otherwise, dskxfer() repeats the loop until it 
receives a known return code. The use of the infinite loop forces the 
process to repeat until the user returns a valid response. 

The next step up from the disk driver interface dskxfer() is 
flushl() (Listing 4.6). This function moves up one level by working 
with the block structure, unlike dskxfer() which works with the data 
itself. This allows flushl() to work more efficiently through key 
block structure members and also gives it full control over when to do a 
physical write. 
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The first order of business for f1]ush1() is to examine the members 
b_flag, which indicates that the data in the buffer is valid if TRUE, and 
b_update, which indicates that the data in the buffer has been modi- 
fied. If both these members are true, flushl() transfers the data to 
disk. When flushl() completes the operation, it clears b_update, 
indicating the data has been saved to disk, and clears b_f1ag if an error 
occurred during the write process. 

Like its companion flushl(), fil1() also is in the next layer up 
from the device driver interface (Listing 4.7). Where f1ush1() handles 
the write operation, fi11() handles the read operation. fi11() is simi- 
lar to f 1}ush1() with one major exception: it must save the buffer, if it 
was modified to disk. In order to do this, the entry code is different. 

Upon entry, fi11() examines the member b_flag, which indicates 
whether or not it contains valid data. It also examines the member 
b_update, which records whether the data has been modified. If both 
conditions are true, fi11() calls flush1l() to clear the buffer before 
proceeding any further. Next, it performs a precautionary test to ensure 


Listing 4.6 Source code for flush1() function. 


BOOL 
flushl (struct buffer FAR *1pBlock) 
( 

REG WORD ok; 


if (lpBlock -> b_flag && IpBlock -> b_update) 
ok = dskxfer(1pBlock -> b_unit, IpBlock -> b_blkno, 
(VOID FAR *)1pBlock -> b _buffer, DSKWRITE); 
else 
ok = TRUE; 
lpBlock -> b_update = FALSE; /* even if error, 
/* mark not dirty 
lpBlock -> b_flag & ok; /* otherwise system has 
/* trouble continuing. 
return(ok); 
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that if the buffer did require a write operation, it completed properly. 
f111() accomplishes this through the local variable ok, which contains 
the return code if a write operation was performed. If the operation 
does not require a write, fi11() sets ok to TRUE in order to guarantee 
the read operation takes place. 

Once the dskxfer( ) operation completes, fi11() assigns the retum 
code to the member b_f1ag. This invalidates the buffer if the read oper- 
ation failed. Finally, f111() clears the member b_update, indicating 
that the buffer is clean, and updates the members b_bl kno and b_unit 
to identify the disk block associated with this buffer. 

With the single buffer operations in place, the next step up is the 
buffer chain operations. These operations are the ones that maintain the 
chain and implement the LRU functionality. There is really only one 
active chain entry point, getblock(), but there are a number of chain 


Listing 4.7 Source code for fi11() function. 


BOOL 
fill (REG struct buffer FAR *lpBlock, LONG blkno, COUNT dsk) 
{ 

REG WORD ok; 


if (1pBlock -> b_flag && IpBlock -> b_update) 
ok = flushl(1pBlock); 

else 
ok = TRUE; 


i f (ok) 
ok = dskxfer(dsk, blkno, (VOID FAR *)1lpBlock -> b_buffer, 
DSKREAD) ; 
lpBlock -> b_flag = ok; 
lpBlock -> b_update = FALSE; 
\pBlock -> b_blkno = bikno; 
lpBlock -> b_unit = dsk; 
return(ok); 
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maintenance functions. The function flush_buffers() is an example 
of a chain maintenance function (Listing 4.8). 

Sometimes DOS-C needs to ensure that all the disk buffers are writ- 
ten to disk. This condition may occur when all files on a disk are closed 
and files were written to. This is an example of when DOS-C calls 
flush_buffers(). 

The function is simple. It begins by initializing the variable 
1pBlock. It uses a while loop to walk the chain, examining each buffer 
on the chain for a match to the requested disk. If the buffer belongs to 
the requested disk, flush_buffers() invokes flushl() to write the 
buffer out. It then picks up the next buffer address from the b_next 
member and continues the loop. Note that the loop terminates with a 
null pointer that terminates the chain. 

Other simple functions perform operations similar to f ]ush_buffers(), 
such as init_buffers(), which initializes the chain; flush(), which 
flushes all buffers to disk; and setinv1d(), which invalidates a number 


Listing 4.8 Source code for flush_buffers () function. 


BOOL 

flush_buffers (REG COUNT dsk) 

( 
REG struct buffer FAR *1pBlock; 
REG BOOL ok = TRUE; 


lpBlock = firstbuf; 
while (1pBlock) 
{ 

if(1pBlock -> b_unit = dsk) 

if(!flushl(1pBlock) ) 
ok = FALSE; 

\pBlock = 1pBlock -> b_next; 
} 
return ok; 
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of buffers on the chain. These are chain maintenance functions whose 
need arises from various DOS-C conditions, but a single entry, as men- 
tioned earlier, performs the LRU maintenance. This results from the fun- 
damental design rule for block /O. DOS-C reads and writes all disk 
blocks through the buffer chain. This simplifies stream I/O at the higher 
levels. All higher level functions request a block and modify it. When it 
becomes necessary to update a block, DOS-C performs all updates 
through the buffer chain maintenance functions. As a result, there are no 
explicit read or write block calls. 


Listing 4.9 Source code for getblock() function. 


struct buffer FAR * 

getblock (LONG blkno, COUNT dsk) 

[ 
REG struct buffer FAR *1pBlock; 
REG struct buffer FAR *]pLastBlock; 
REG struct buffer FAR *|pMiddleBlock; 
REG WORD imsave; 


/* Search through buffers to see if the required block 
/* is already in a buffer 


IpBlock = firstbuf; 
lpLastBlock = NULL; 
lpMiddleBlock = NULL; 
while(}pBlock != NULL) 
{ 
if ((1pBlock -> b_flag) && (IpBlock -> b_unit = dsk) 
&& (1pBlock -> b_blkno = blkno) ) 
l 
/* found it -- rearrange LRU links 
if(1pLastBlock != NULL) 
[ 
}pLastBlock -> b_next = IpBlock -> b_next; 
lpBlock -> b_next = firstbuf; 
firstbuf = 1pBlock; 
} 
return(1pBlock); 
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The function getb1 ock( ) is responsible for returning a block to any 
function that needs it (Listing 4.9). In order to do this and follow the 
basic design rules, the design for getblock() is in two parts. First it 
searches the chain, and next it removes a block. How it removes the 
block is what determines the read or write operation. 

The function starts off by initializing three pointers: 1pBlock, 
IpLastBlock, and 1pMiddleBlock. The pointer 1 pBlock points to the 
current block and is the variable used to terminate the whi |e loop fol- 
lowing the initialization. Three conditions determine if a block is 
found: (1) the block is valid (b_f1lag), (2) it is from the correct disk 
(b_unit), and (3) it is the block we want (b_b1 kno). If these conditions 
are met, getblock( ) removes the buffer from the chain and places it at 
the front of the chain before returning the buffer address; otherwise, it 


Listing 4.9 Source code for getblock() function — continued. 


else 
( 
/* move along to next buffer*/ 
lpMiddleBlock = 1pLastBlock; 
IpLastBlock = 1pBlock; 
lpBlock = IpBlock -> b_next; 
| 
} 
/* The block we need is not in a buffer, we must make a */ 
/* buffer available, and fill it with the desired block */ 


/* detach lru buffer */ 
if( ]pMiddleBlock != NULL) 

IpMiddleBlock -> b_next = NULL; 
IpLastBlock -> b_next = firstbuf; 
firstbuf = lpLastBlock; 
if(flushl(]pLastBlock) && fill(]pLastBlock, blkno, dsk)) 
/* success */ 

lpMiddleBlock = IpLastBlock; 
else 

lpMiddleBlock = NULL; /* failure 
return (1pMiddleBlock); 
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reassigns the pointers and moves the current buffer pointer to the next 
buffer in the chain. If the while loop terminates, then the buffer is not 
on the chain and getblock() removes the last buffer from the chain. 
flushl() writes out the block’s contents, if necessary, and fil1() 
reads the desired block’s contents. getbl ock() then returns the block 
address to the caller, conditioned by error handling. 


FAT File System Functions 


Three files make up the fs module: fattab.c, fatdir.c, and 
fatfs.c. I chose to partition the functions that make up fs in this way 
so that each of these three files corresponds to the three distinct objects 
that make up the FAT file system disk discussed in Chapter 2. The file 
fattab.c deals with FAT handling on a device. It works with both 
12-bit and 16-bit FAT devices, and it treats both in a uniform fashion. 
The file fatdir.c handles all directory management and is responsible 
for directory searches (1.e., Find First and Find Next). The file fatfs .c 
handles all the remaining data functions and maps these functions into 
a support subset for the dosfns and fcbfns personality layers. 


FAT Management 


As you have seen before, a symmetry similar to the read and write sym- 
metry seen in both the character and block device interfaces exists for 
both operations of linking a FAT block number into a chain and retriev- 
ing a FAT block number from the FAT chain (Figure 4.5). The file 
fattab.c is made up of a FAT block retrieval portion (read) and a FAT 
block linking portion (write). The retrieval portion entry is 
next_cluster(). This function returns the next cluster in a chain 
given the current cluster. Similarly, the linking portion entry is 
link_fat(). This function takes two FAT cluster numbers and links 
the second cluster into the first cluster. 
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Retrieving a FAT Cluster Number 


The best way to look at these functions is to start with next_cluster() 
and then examine link_fat(). Because next_cluster() does not 
modify any data structures, I will use it to study the fundamental algo- 
rithms. I will proceed by looking at the 16-bit handlers, then the 12-bit 
handlers. The 16-bit handlers are easier to understand because they 
don’t have to deal with packing issues resulting from the mapping of 
12-bit values into byte-oriented data structures. Using this approach, 
you get an opportunity to look at the FAT closely before dealing with the 
complications of the linking functions. 


Figure 4.5 FAT file system functionality layer. 
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The design of the upper layer of the FAT file system requires that the 
FAT handlers treat all disks in a uniform manner. As discussed earlier, 
there are two different FAT sizes: 12-bit and 16-bit. Because the higher 
level functions don’t concern themselves with the type of FAT, only a sin- 
gle entry point is provided: next_cluster( ) (Listing 4.10). The require- 
ments of this function are relatively simple: it must decide what type of 
FAT the disk uses and then dispatch the correct handler. The macro 
ISFAT12 (Figure 4.6) is used to decide which function to call by first test- 
ing the disk parameter block dpb_size member to determine what size 
FAT the current device contains. This test looks to see if the size of the 
disk is less than or equal to FAT_MAGIC (4086) blocks. If it is, then the 
FAT is a 12-bit type and next_cluster() invokes next_cl12(); other- 
wise it invokes next_c116(). 

The two FAT handlers are both similar in design, but I will examine 
next_c116() first (Listing 4.11). This provides an opportunity to look 
cleanly at the algorithm that walks a linked list in the FAT. In each case, 
the algorithm begins by getting the physical block that corresponds to 
the given FAT number. It determines the blocks location on disk by 
computing the number of FAT entries within a single disk block and 
scaling the block number by the number of FAT entries within a block. 
For both the 12-bit and 16-bit cases, the algorithm then computes an 
index into the block it fetched earlier. It then retrieves the next cluster 
number from the block by using a byte-order-neutral function and 
returns the next cluster number to the caller. 


Listing 4.10 Source code for next_cluster() function. 


UWORD 
next_cluster (struct dpb *dpbp, REG UCOUNT ClusterNum) 
{ 
if (ISFAT12(dpbp)) 
return next_cll2(dpbp. ClusterNum): 
else if (ISFAT16(dpbp) ) 
return next_cll6(dpbp, ClusterNum): 
else 
return LONG_LAST_CLUSTER; 
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Figure 4.6 ISFAT12 macro for use with next_cluster(). 


define ISFATI2(dpbp) (((dpbp) ->dpb_size)<=FAT_MAGIC) 


WORD 
next_cluster (struct dpb *dpbp, REG UCOUNT ClusterNum) 
{ 

if (ISFAT12(dpbp)) 

return next_cll2(dpbp, ClusterNum) ; 

else if (ISFAT16(dpbp)) 

return next_cll6(dpbp, ClusterNum) ; 

else 

return LONG_LAST_CLUSTER; 
J 


Listing 4.11 Source code for next_c116() function. 


UWORD 
next_cll6 (struct dpb *dpbp, REG UCOUNT ClusterNum) 
| 

UCOUNT idx; 

struct buffer FAR *bp; 

UWORD RetCluster; 


/* Get the block that this cluster is in */ 
bp = getblock((LONG)(((LONG)ClusterNum) * SIZEOF_CLST16) / 
dpbp -> dpb_secsize + dpbp -> 
dpb_fatstrt + 1,dpbp -> dpb_unit); 
if(bp == NULL) 
return DE_BLKINVLD; 


/* form an index so that we can read the block as a 
/* byte array 
idx = (((LONG)ClusterNum) * SIZEOF_CLST16) % 

dpbp -> dpb_secsize; 


/* Get the cluster number, */ 
fgetword((VOID FAR *)&(bp -> b_buffer{idx]), 
(WORD FAR *)&RetCluster); 


/* and return successful. */ 
return RetCluster; 
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The differences between the 12-bit and 16-bit cases become appar- 
ent when you compare next_cl12() to next_c116(). next_cl12() is 
shown in Listing 4.12. The most obvious difference is in size. 
next_cl12() is larger than its 16-bit counterpart, which results from 
the special unpacking that it must perform. The 12-bit FAT packs two 
FAT numbers into 3 bytes, saving | byte for every two entries when 
compared to the 16-bit FAT. However, the complexity of this operation 
forces next_c112() to work harder. 

Like its 16-bit counterpart, next_cl12() begins by getting the 
block that corresponds to the given FAT cluster number. It also deter- 
mines the block by computing the number of FAT cluster entries within 





Listing 4.12 Source code for next_c112() function. 


UWORD 
next_cll2 (struct dpb *dpbp, REG UCOUNT ClusterNum) 
{ 

REG UBYTE FAR *fbp0O, FAR *fbpl; 

UCOUNT idx; 

struct buffer FAR *bp, FAR *bpl; 


/* Get the block that this cluster is in */ 

bp = getblock((LONG)((((ClusterNum << 1) + ClusterNum) >> 1) / 
dpbp -> dpb_secsize + dpbp -> 
dpb_fatstrt + 1), dpbp -> dpb_unit): 

if(bp = NULL) 


return BAD; 
/* form an index so that we can read the block as a */ 
/* byte array */ 
idx = (((ClusterNum << 1) + ClusterNum) >> 1) % dpbp -> dpb_secsize; 
/* Test to see if the cluster straddles the block. If it */ 
/* does. get the next block and use both to form the */ 
/* FAT word. Otherwise, just point to the next block. */ 


if(idx >= dpbp -> dpb_secsize - 1) 
{ 

bp1 = getblock((LONG)(dpbp -> dpb_fatstrt + 

((((ClusterNum << 1) + ClusterNum) >> 1) / 
dpbp -> dpb_secsize)) + 2, dpbp -> dpb_unit): 
if(bpl — (struct buffer *)0) 
return BAD; 
fbpl = (UBYTE FAR *)&(bpl -> b_buffer(0]): 


DOS-C Kernel: File System Manager — 99 


a single disk block and scaling it. At this point, the 12-bit algorithm 
diverges from the 16-bit algorithm. The 12-bit FAT table entry always 
straddles 2 bytes and the algorithm computes an index into the block. 
However, because of the 3-to-2 ratio of bytes-to-entries, certain block 
sizes may cause the last entry in the block to straddle into the next 
block. For example, in a 512-byte block, 341 entries occupy 511 bytes, 
leaving a single byte in each 512-byte block. This results in the unique 
situation that if the index points to the last byte in the block, the entry 
straddles the next block. 

At this point, next_cl12() determines if the FAT cluster number 
straddles the block with the use of two pointers to build the FAT cluster 
number. If the entry starts on the last byte of the block, next_c112() 
reads the next block and sets the second pointer to the first byte, other- 
wise it simply sets the second pointer to the next byte in the block. 

At this point, next_cl12() fetches the 2 bytes that compose the 
next cluster number from the block or blocks by using a special method 
that differentiates odd-numbered from even-numbered FAT entries, 
which are packed differently. next_cl12() performs a test for odd- 
numbered FAT entries and executes the correct unpacking procedure 


Listing 4.12 Source code for next_c112() function — 
continued. 


else 

fbpl = (UBYTE FAR *)&(bp -> b_bufferlidx + 1)); 
fbpO = (UBYTE FAR *)&(bp -> b_bufferLlidx]); 
/* Now to unpack the contents of the FAT entry. Odd and 
/* even bytes are packed differently. 
if (ClusterNum & 0x01) 

ClusterNum = ((*fbpO & Oxf0) >> 4) | *fbpl <« 4; 
else 

ClusterNum = *fbpO | ((*fbpl & Ox0f) << 8); 
if ((ClusterNum & MASK) == MASK) 

ClusterNum = LAST_CLUSTER; 
else if ((ClusterNum & BAD) = BAD) 

ClusterNum = BAD; 
return ClusterNum; 
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based on the outcome. The function returns this number unless it 
encounters a last cluster or bad cluster value, in which case it returns 
the 16-bit equivalent instead, satisfying the FAT size independence 
requirement of the upper layers. 


Writing a FAT Cluster Number 


As with the next_cluster() function, the higher level functions don’t 
concern themselves with the type of FAT, so only a single entry point is 
provided: link_fat() (Listing 4.13). The requirements of this func- 
tion are the same as those of next_cluster(): it must decide what 
type of FAT the disk uses and then dispatch the correct handler. It per- 
forms these tasks in exactly the same way as next_cluster() and dis- 
patches either ]ink_fat12() or link_fat16() for 12-bit and 16-bit 
FATs respectively. 

Again the two FAT handlers are both similar in design, and I will 
examine the 16-bit handler, ] ink_fat16(), first (Listing 4.14). As you 
already know, the 16-bit version is simpler than the 12-bit version 
because it doesn’t concern itself with packing and unpacking cluster 
numbers. 

As with the next_cluster() functions, the 1 ink_fat() algorithms 
begin by getting the block that corresponds to the given FAT number. 
They both determine the block by computing and scaling the number of 
FAT entries within a single disk block. For both the 12-bit and 16-bit 
cases, the algorithm then computes an index into the block and places 


Listing 4.13 Source code for 1ink_fat() function. 


UCOUNT 
link_fat (struct dpb *dpbp, UCOUNT Clusterl, REG UCOUNT Cluster2) 
( 
if (ISFAT12(dpbp ) ) 
return link_fatl2(dpbp, Clusterl, Cluster2); 
else if (ISFAT16(dpbp) ) 
return link_fatl6(dpbp, Clusterl, Cluster2); 
else 
return DE_BLKINVLD; 





DOS-C Kernel: File System Manager — 101 


the next cluster number into the block using a byte-order-neutral func- 
tion. Unlike the next_c116() function, 1 ink_fat16() must now mark 
the block as modified in order to guarantee that the FAT is properly 
updated. It also updates the dpb, which is an internal data structure 


Listing 4.14 Source code for link_fat16() function. 


UCOUNT 
link_fatl6 (struct dpb *dpbp. UCOUNT Clusterl, UCOUNT Cluster2) 
{ 

UCOUNT idx; 

struct buffer FAR *bp; 

UWORD C12 = Cluster2; 


/* Get the block that this cluster is in */ 

bp = getblock((LONG)(((LONG)Clusterl) * SIZEOF_CLST16) / 
dpbp -> dpb_secsize + dpbp -> dpb_fatstrt + 1, 
dpbp -> dpb_unit); 

if(bp == NULL) 


/* Finally, put the word into the buffer and mark the 

/* buffer as dirty. 

fputword((WORD FAR *)&C12, (VOID FAR *)&(bp -> 
b_buffer[idx])); 

bp -> b_update = TRUE; 


/* Return successful. 
/* update the free space count 
if(Cluster2 = FREE) 
{ 
/* update the free space count for returned cluster 
if(dpbp -> dpb_nfreeclst != UNKNCLUSTER) 
++dpbp -> dpb_nfreecist; 
J 
else 
{ 
/* update the free space count for removed cluster 
if(dpbp -> dpb_nfreeclst != UNKNCLUSTER) 
--dpbp -> dpb_nfreecist; 
J 


return SUCCESS; 
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associated with the disk. It increases the dpb_nf reec|lst member if the 
linked cluster is actually a free cluster entry and decreases it otherwise. 
This way, the structure is updated as we remove clusters from and 
retum clusters to the free pool. 


Listing 4.15 Source code for link_fat12() function. 


UCOUNT 
link_fatl2 (struct dpb *dpbp, UCOUNT Clusterl, UCOUNT Cluster2) 
{ 

REG UBYTE FAR *fbpO, FAR *fbpl; 

UCOUNT idx; 

struct buffer FAR *bp, FAR *bpl; 


/* Get the block that this cluster is in */ 
bp = getblock((LONG)((((Clusterl << 1) + Clusterl) >> 1) / 
dpbp -> dpb_secsize + dpbp -> dpb_fatstrt 
+ 1), dpbp -> dpb_unit); 
if(bp = NULL) 
return DE_BLKINVLD; 


/* form an index so that we can read the block as a 

/* byte array 

idx = (((Clusterl << 1) + Clusterl) >> 1) % dpbp -> 
dpb_secsize; 


/* Test to see if the cluster straddles the block. If it */ 
/* does, get the next block and use both to form the */ 
/* FAT word. Otherwise, just point to the next block. */ 
if(idx >= dpbp -> dpb_secsize - 1) 
{ 

bpl = getblock((LONG)(dpbp -> dpb_fatstrt + ((((Clusterl << 


+ Clusterl) >> 1) / dpbp -> dpb_secsize)) 
+ 2, dpbp -> dpb_unit); 
if(bpl — (struct buffer *)0) 
return DE_BLKINVLD; 
bpl -> b_update = TRUE; 
fbpl = (UBYTE FAR *)&(bpl -> b_buffer[0]); 
J 
else 
fbpl = (UBYTE FAR *)&(bp -> b_buffer[idx + 1]); 
fbpO = (UBYTE FAR *)&(bp -> b_bufferlidx]); 
bp -> b_update = TRUE; 





DOS-C Kernel: File System Manager — 103 


Like its 16-bit counterpart, the ]}ink_fat12() (Listing 4.15) func- 
tion begins by getting the block that corresponds to the given FAT num- 
ber. It also determines the block and fetches it. Next, it computes an 
index into the block, but unlike 1 ink_fat16(), it must pack the desired 
cluster into the block. To do this, ] ink_fat12() uses an algorithm sim- 
ilar to that used by next_cl12(). Once it packs the cluster number into 
the block, it must mark the block and update the dpb in the same way as 
link_fat1l6(). 


Listing 4.15 Source code for link_fat12() function — 
continued. 


/* Now pack the value in */ 
if (Clusterl & 0x01) 
( 


*fbpO = (*fbpO & OxOf) | ((Cluster2 & Ox0f) <«< 4); 
*fbpl = (Cluster2 >> 4) & Oxff; 
J 
else 
{ 
*fbpO = Cluster2 & Oxff; 
*fbpl = (*fbpl & Oxf0) |(Cluster2 >> 8) & OxOf; 


} 


/* update the free space count */ 
if(Cluster2 = FREE) 
( 
/* update the free space count for returned cluster */ 
if(dpbp -> dpb_nfreeclst != UNKNCLUSTER) 
++dpbp -> dpb_nfreeclist; 
} 
else 
( 
/* update the free space count for removed cluster x) 
if(dpbp -> dpb _nfreecist != UNKNCLUSTER) 
--dpbp -> dpb_nfreeclst; 
J 


return SUCCESS; 
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Directory Management 


Now that I have examined the functions that manage the disk FAT, I will 
move on to the directory management functions. This really puts you in 
an excellent position because you have seen the interaction of functions 
at this level with the getblock( ) support function. Many of the other 
functions at this level will operate in a similar fashion. You will also get 
a first look at a DOS service function. These functions typically have a 
DOS-like functionality associated with them but may deal with an inter- 
nal data structure or may only supply a portion of the DOS function. 


The dirent Structure 


Two important structures are used in directory management: dirent and 
f_node (Listings 4.16 and 4.17). The dirent structure is the first struc- 
ture I will examine, and it is also a key data structure. The structure’s 
design mirrors the FAT directory entry so that you can easily transfer 
directory data into and out of memory. It contains filename and extension 
in fields dir_name and dir_ext identical to the directory record on the 
disk. DOS-C maintains attributes, date, and time stamp in fields 
dir_attrib, dir_date, and dir_time that are in memory identical to 


Listing 4.16 Source code for dirent structure. 


struct dirent 
( 
UBYTE Filename 
dir_name[FNAME_SIZE]; 
UBYTE Filename extension 
dir_ext[FEXT_SIZE]; 
UBYTE File Attribute 
dir_attrib; 
BYTE dir_reserved(10]: reserved 
time dir_time; Time file created/updated 
date dir_date; Date file created/updated 
UWORD dir_start; Starting cluster 
lst available = 2 
ULONG dir_size; File size in bytes 
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their respective directory record entries. However DOS-C translates file 
size and starting cluster in dir_size and dir_start tothe machine’s 
native memory representation. For 80x86 machines, the memory format 
matches the disk format, but the code uses portable techniques for other 
architectures. As you will see, DOS-C uses this structure for every file 
access and maintenance routine. 


Listing 4.17 Source code for f_node structure. 


struct f_node 
[ 
UWORD f_count; number of uses of this file 
COUNT f_mode; read, write, read-write, etc 
struct 
{ 
BOOL directory has been modified 
f_dmod: 1; 
BOOL directory is the root 
f_droot:1; 
BOOL f_node is new and needs fill 
f_dnew:1; 
BOOL f_node is assigned to dir 
f_ddir:1; 
BOOL directory is full 
f_dfull:1; 
) f_flags; file flags 
struct dirent this file's dir entry image 
f_dir; 
ULONG f_diroff:; offset of the dir entry 
UWORD f_dirstart;: the starting cluster of dir 
when dir is not root 
struct dpb * the block device for file 
f_dpb; 
ULONG f_dsize; file size (for directories) 
ULONG f_offset; byte offset for next op 
ULONG f_highwater; the largest offset ever 
UWORD f_back; the cluster we were at 
UWORD f_cluster; the cluster we are at 
UWORD f_relcluster; the relative cluster 
UWORD f_sector; the sector in the cluster 
UWORD f_bof f; the byte in the cluster 
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The f_node Structure 


The f_node structure is the central data structure that every file and 
directory operation in fs uses. Within fs, there is a one-to-one relation- 
ship between the f_node (file node) structure and any file or directory. 
Understanding this structure is key to understanding fs itself. 

The first member in the structure is f_count. This member main- 
tains a count of the number of references (open, create, etc.) made to 
this file. When this member is zero, the f_node is free. The next mem- 
ber is f_mode. This member contains a flag that indicates what mode 
(read, wnite, read-write, etc.) the file is in. 

Many operations within fs go through states. For example, a file 
must be opened before it can be read. The state that it is in when open 
must be carried through from one operation to another. Additionally, 
other more subtle states need storage. The member f_f1]ags contains 
these bits. The first bit of importance is f_dmod, which indicates that 
the directory has been modified. This bit is examined when DOS-C 
closes a file and the directory entry is updated if it is set. The next bit is 
f_droot, which indicates that the directory associated with this f_node 
is the root directory and requires special handling. When an f_node is 
first allocated, a bit is necessary to indicate that the f_node is new and 
needs to be filled. Bit f_dnew indicates this state. Also, an f_node can 
be assigned either to a file or a directory. If bit f_ddir is TRUE, the 
f_node is assigned to a directory. Finally, on occasion, a directory 
search may encounter an “end of directory” condition, a state especially 
important when trying to obtain a free directory slot. Bit f_dful1] indi- 
cates that the directory is full when TRUE. 

The next member in the structure is the dirent f_dir structure. 
This is the file’s directory entry image. Every file operation, whether 
directory management or data manipulation, uses this entry. Associated 
with the directory image are two members that position the entry within 
its directory: f_diroff and f_dirstart. The member f_diroff 
records the byte offset of the directory entry. It is used primarily for 
locating the correct block and offset for directory read and write opera- 
tions. Member f_dirstart records the starting cluster of the directory 
when it is not the disk’s root directory. As discussed in Chapter 2, the 
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root directory is a sequential set of disk blocks fixed in both position 
and length, whereas subdirectories are special files that obey all file 
rules. As a result, functions that read and write the contents of a direc- 
tory must be aware of the distinction. The member f_dirstart is an 
optimization entry that allows DOS-C to do the disk search to find the 
start only once. As you will see later, this search can be costly for files 
that are many directories deep. 

The next entry is a pointer toa dpb struct. This is the disk parame- 
ter block for the block device corresponding to this file. dpb is a data 
structure that contains all disk-related information such as disk size, 
geometry, etc. It is this pointer that DOS-C uses to identify the disk. In 
fact, this is the structure built up by the Media Check Device Driver call. 

The next member, f_dsize, is a measure of the size in byes of direc- 
tories. DOS-C uses this member to maintain the size of the directory 
during directory searches. It sets this member when opening a directory 
and checks it when reading a directory. DOS-C also maintains this 
f_node member in Find First and Find Next functions, because these 
functions use the Directory Read function in their search. 

The next two members in the f_node data structure maintain the file 
position information. The first member, f_of fset, records the byte offset 
in the file for the next read or wnite operation. At the beginning of each 
data transfer operation, DOS-C translates f_offset into information on 
the physical position of the block and computes the index into the block. 
In addition, with each write operation, DOS-C compares f_offset to 
f_highwater and transfers f_offset to f_highwater if f_offset is 
greater than f_highwater. In this way f_highwater is the write high- 
water mark, recording the largest offset ever encountered in f_offset. 
You need to record this information in the event that an application per- 
forms a seek operation prior to a file write operation. If DOS-C wntes data 
to the file in a way that extends the file length, and a seek operation fol- 
lows that moves f_offset to a position within the file, f_highwater 
contains the true file size. DOS-C later uses f_highwater to update the 
directory entry when the application closes the file. 
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The member f_back contains the last cluster number of a sequential 
file access. It is used to link the FAT when new blocks are added to the 
file. Associated with this member is f_cluster. This member contains 
the current cluster number and is used to map the current position of the 
next read or write operation performed on the file. 

In order to map the file offset into the physical sector and the byte 
offset into the sector, structure members f_sector and f_boff are 
used to map the sector into the cluster and the byte offset into the clus- 
ter, respectively. These members are necessary because they must be 
maintained between read and write operations in order to lessen the 
overhead for each operation. 


Opening a Directory 


Now that I have examined the fundamental f_node data structure, I can 
begin to look at the file system functions. These functions are the first 
of the file system functions I will cover that make use of the f_node 
structure. They are also similar to the FAT functions in that they man- 
age the second part of a FAT file. 

DOS-C treats directories in a fashion similar to files. A directory is 
referenced identically to a file. For example, a directory has a path 
name that references the directory. It also has a directory entry in a 
directory with one notable exception — the root directory. Its content is 
a sequential set of records that are randomly accessed and updated as 
needed. In short, a directory resembles a file in many respects. For all 
these reasons, DOS-C manages directories in a fashion similar to files. 

Opening a file entails allocating an f_node kernel data structure and 
initializing it through a call to dir_open( ) (Listing 4.18). dir_open( ) 
does this by first allocating an f_node structure and presetting it to 
read-write mode. Any f_node that represents a directory must be 
placed into read-write mode so that DOS-C properly records any 
update made to the file by subsequent operations. 

Next you determine what drive you are using so that you can index 
the block device table. You need this information because this data struc- 
ture contains the entire state of the block device which may be modified 
by the user (e.g., change the current directory, write to a file, etc.). 
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Listing 4.18 Source code for dir_open() function. 


struct f_node * 
dir_open (BYTE FAR *dirname) 
( 
struct f_node *fnp; 
WORD drive; 
REG struct dpb *dpbp; 
struct f_node *get_f_node(); 
BYTE path[64], *s; 
/* Allocate an f_node if possible - error return (0) if not. */ 
if((fnp = get_f_node()) = (struct f_node *)0) 
[ 
return (struct f_node *)NULL; 
J 


/* Force the f_node into read-write mode */ 
fnp -> f_mode = ROWR; 


/* Determine what drive we are uSing... x 


/* for FAT-style file systems, if the second character of */ 


/* of the path = °:", then a drive specifcation was ay 
/* issued. If it was, update the path to point past the */ 
/* file specification and assign drive to the requested =f 


/* drive, otherwise ... */ 
if(*(dirname + 1) = ':') 
{ 

drive = *dirname - ‘A’; 

if(drive > 26) 

drive -= ('a' - ‘A'); 

dirname += 2; 

dpbp = &blk_devices[drive];: 
} 


else 

[ 

/* Select the default to help non-drive specified path wy 
/* searches... */ 


dpbp = &blk_devices[drive = default_drive]; 
| 
fnp -> f_dpb = dpbp: 
/* Generate full path name xy, 
if(*dirname = '/' || *dirname = ‘\\') 
fscopy(dirname, (BYTE FAR *)path); 
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You then generate the full path name so that you can properly locate 
the directory that contains its entry. Remember that a user can specify 
an absolute filename (i.e., c: \dos\format.exe) ora relative filename 
(i.e., ..\my_dir\my_file.txt). You parse the filename prepending 
the current directory for that drive if necessary, then clean up the path 
by eliminating redundant path separators and relative directory names 
such as “*.” and “..”. 


Listing 4.18 Source code for dir_open() function — 
continued. 


/* start with the logged in directory 
scopy(dpbp -> dpb_path, path); 

/* and append passed relative directory 
for(s = path; *s != ‘\0'; +); 

*stt = '\\'; 

fscopy(dirname, (BYTE FAR *)s); 


} 

/* then clean up the path 

trim_path(path); 

/* Determine if we are starting from the root... 


/* For FAT-style file systems, the root is a consective 

/* number of blocks given by the bpb. All sub-directories 

/* are files and need to be treated as such. 

fnp -> f_flags.f_droot = (*path = ‘/* || *path = ‘\\') 
&& (*(path + 1) = ‘\0'); 


/* Perform all directory common handling after al) 
/* special handling has been performed. 


++dpbp -> dpb_count; 

if(media_check(dpbp) < 0) 

( 
--dpbp -> dpb_count; 
release_f_node(fnp); 
return (struct f_node *)0; 
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Now the fun begins. You need to determine if you are starting from 
the root because for FAT-style file systems, the root is a consecutive 
number of blocks given by the dpb. However, all subdirectories are files 
and dir_open() handles them as such. Because of the differences 


Listing 4.18 Source code for dir_open() function — 
continued. 


fnp -> f_diroff = 01; 

fnp -> f_flags.f_dmod = FALSE; /* a brand new f_node 
fnp -> f_flags.f_dnew = TRUE; 

fnp -> f_dsize = DIRENT_SIZE * dpbp -> dpb_dirents; 
if(!fnp -> f_flags.f_droot) 

{ 


BYTE dbuf f{FNAME_SIZE+FEXT_SIZEJ, *p; 
WORD i; 
/* Walk the directory tree to find the starting cluster 
/* 
/* Set the root flags since we always start 
/* from the root 
fnp -> f_flags.f_droot = TRUE; 
for(p = path; *p != ‘\0'; ) 
( 
/* skip all path seperators 
while(*p = '/' || *p = ‘\\') 
++p; 
/* don't continue if we're at the end 
if(*p = '\0') 
break; 


/* Convert the name into an absolute ™/ 
/* name for comparison... xf 
/* first the file name with trailing spaces... */ 
for(i = 0; i < FNAME_SIZE; i++) 
( 
if(*p = ‘\0' && *p '='." && *p != '/* && *p != ‘\\') 
dbuff{i] = *pt+t; 
else 
break; 
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between a FAT root directory and subdirectory, the initial task of find- 
ing the starting point of the directory differs, and dir_open() handles 
each case individually. 


Listing 4.18 Source code for dir_open() function — 
continued. 


for( ; i < FNAME_SIZE; i++) 
dbufffiJ =' '; 


/* and the extension (don't forget to 
/* add trailing spaces)... 
if(*p = '.") 
+); 
for(i = 0; i < FEXT_SIZE; i++) 
[ 
if(*p '='"\0' 6& *p != '.' && *p != '/* && *p != '\\") 
dbuf f(i+FNAME_SIZE] = *pH; 
else 
break; 
| 
for( ; i < FEXT_SIZE; i++) 
dbuff(i+FNAME_SIZE] =‘ '; 


/* Now search through the directory to 

/* find the entry... 

i = FALSE; 

touc( (BYTE FAR *)dbuff, FNAME_SIZE+FEXT_SIZE); 
while(dir_read(fnp) = DIRENT_SIZE) 

[ 


if(fnp -> f_dir.dir_name([0] != '\0' && fnp -> 
f_dir.dir_name{0} != DELETED) 
{ 


if(fcmp( (BYTE FAR *)dbuff, (BYTE FAR *)fnp -> 
f_dir.dir_name, FNAME_SIZE+FEXT_SIZE)) 
{ 
i = TRUE; 
break; 
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After dir_open() completes all special handling it performs all 
common directory handling. It increments the reference count for the 
dpb in order to keep track of the number of files using this block device. 
It also performs a media check to make sure the user did not change the 
media in the drive, and it corrects internal data structures if they are dif- 
ferent. This is one important difference between MS-DOS-style operat- 
ing systems and UNIX-style operating systems. MS-DOS-style 
operating systems allow you to change the disk on the fly, whereas 


Listing 4.18 Source code for dir_open() function — 
continued. 


if(!i || !(fnp -> f_dir.dir_attrib & D_DIR)) 
{ 
--dpbp -> dpb_count; 
release_f_node(fnp); 
return (struct f_node *)0; 
else 
{ 
/* make certain we've moved off root 
fnp -> f_flags.f_droot = FALSE; 
fnp -> f_flags.f_ddir = TRUE; 
/* set up for file read/write 
fnp -> f_offset = 01; 
fnp -> f_highwater = 01; 
fnp -> f_cluster = fnp -> f_dir.dir_start; 
fnp -> f_dirstart = fnp -> f_dir.dir_start: 
/* reset the directory flags 
fnp -> f_diroff = 01; 
fnp -> f_flags.f_dmod = FALSE; 
fnp -> f_flags.f_dnew = TRUE; 
fnp -> f_dsize = DIRENT_SIZE * dpbp -> dpb_dirents; 


} 
| 
return fnp; 
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UNIX-style operating systems require that the disk be mounted prior to 
use. Any file system open or create will always do a media check as a 
pseudo-mount operation. 

If the media check is successful, dir_open( ) initializes the f_node 
data structures to point to the beginning of the directory, ready for a 
read operation. Part of this initialization is walking the directory tree to 
find the starting cluster. Remember that the search may need to go 
through several directories, venf ying the path, in order to find the start- 
ing cluster. This search first sets the root flags because the trimmed path 
is an absolute path that always starts from the root directory. 
dir_open( ) adjusts each component of the path to match an MS-DOS 
directory name entry by adding trailing spaces and concatenating the 
extension. It then performs a comparison between the adjusted name 
and the name currently in the f_node. When a match is found, 
dir_open( ) records the starting cluster into the f_node. dir_open( ) 
repeats this process until the path is exhausted and the f_node contains 
the starting cluster of the full path. 


Listing 4.19 Source code for dir_read() and dir_write() 
functions. 


COUNT 
dir_read (REG struct f_node *fnp) 
{ 
REG i, j: 
struct buffer FAR *bp; 
/* Directories need to point to their current offset, 
/* not for next op. Therefore, if it is anything other 
/* than the first directory entry, we will update the 
/* offset on entry rather than wait until exit. If it 
/* was new, clear the special new flag. 
if(fnp -> f_flags.f_dnew) 
fnp -> f_flags.f_dnew = FALSE; 
else 
fnp -> f_diroff += DIRENT_SIZE; 
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When dir_open() successfully completes, it returns a pointer to the 
initialized f_node data structure. This is an important feature to note 
because this pointer is the internal representation of a file. All file oper- 
ations work on this internal data structure and DOS-C converts all user 
handles to an f_node pointer through the interface functions for inter- 
nal file operations. 


Reading/Writing a Directory Entry 


Once a directory is opened, it can either be read from or written to. 
DOS-C provides two functions for this purpose: dir_read() and 
dir_write() (Listing 4.19). DOS-C only uses these functions in file 


Listing 4.19 Source code for dir_read() and dir_write() 
functions — continued. 


Determine if we hit the end of the directory. If we 
have, bump the offset back to the end and exit. If 
not, fill the dirent portion of the f_node, clear the 
f_dmod bit and leave, but only for root directories 
if(!(fnp -> f_flags.f_droot) 
&& fnp -> f_diroff >= fnp -> f_dsize) 
[ 
fnp -> f_diroff -= DIRENT_SIZE; 
return 0; 
) 
else 
( 
if(fnp -> f_flags.f_droot) 
( 
if((fnp -> f_diroff / fnp -> f_dpb -> dpb_secsize 
+ fnp -> f_dpb -> dpb_dirstrt) 
>= fnp -> f_dpb -> dpb_data) 
{ 
fnp -> f_flags.f_dfull = TRUE; 
return 0; 
} 
bp = getblock((LONG)(fnp -> f_diroff / fnp -> f_dpb -> 
dpb_secsize + fnp -> f_dpb -> 
dpb_dirstrt), 
fnp -> f_dpb -> dpb_unit); 
} 
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Listing 4.19 Source code for dir_read()and dir_write() 
functions — continued. 


REG UWORD secsize = fnp -> f_dpb -> dpb_secsize; 
/* Do a “seek” to the directory position 
fnp -> f_offset = fnp -> f_diroff; 
/* Search through the FAT to find the block 
/* that this entry is in. 
if(map_cluster(fnp, XFR_READ) != SUCCESS) 
{ 
fnp -> f_flags.f_dfull = TRUE; 
return 0; 


/* If the returned cluster is FREE, return zero 
/* bytes read. 
if(fnp -> f_cluster = FREE) 

return 0; 


/* If the returned cluster is LAST_CLUSTER or 
/* LONG_LAST_CLUSTER, return zero bytes read 
/* and set the directory as full. 
if (last_link(fnp)) 
{ 
fnp -> f_diroff -= DIRENT_SIZE; 
fnp -> f_flags.f_dfull = TRUE; 
return 0; 
| 
/* Compute the block within the cluster and the 
/* offset within the block. 
fnp -> f_sector = 
(fnp -> f_offset / secsize) 
& fnp -> f_dpb -> dpb_clsmask; 
fnp -> f_boff = fnp -> f_offset % secsize; 


/* Get the block we need from cache 
bp = getblock( 

(LONG)clus2phys(fnp -> f_cluster, 
fnp -> f_dpb -> dpb_clssize, 
fnp -> f_dpb -> dpb_data) 

+ fnp -> f_sector, 
fnp -> f_dpb -> dpb_unit); 
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open, create, and close operations and they cannot be accessed by the 
user. They are only called in an indirect fashion. As you have done 
before, start with the read function and then note the symmetry and dif- 
ferences between the read and write functions. 

Before I dive into the operation, you should note one difference 
between directory operations and normal file operations. Data read from 
or written to a directory is contained within the f_node data structure 
contained in kernel memory, not in user memory. As a result, DOS-C 
directories need to point to their current offset because of the record 
nature of the directory entry. This means that if it is anything other than 
the first directory entry, you update the offset on entry then wait until 
exit as you would do for normal file data operations. 


Listing 4.19 Source code for dir_read() and dir_write() 
functions — continued. 


/* Now that we have the block for our entry, get 
/* the directory entry. 
if(bp != NULL) 
getdirent((BYTE FAR *)&bp -> b_buffer(fnp -> 
f_diroff % fnp -> f_dpb -> dpb_secsize], 
(struct dirent FAR *)&fnp -> f_dir); 
else 
{ 
fnp -> f_flags.f_dful) = TRUE; 
return 0; 
| 


/* Update the f_node‘s directory info 
fnp -> f_flags.f_dfull = FALSE; 
fnp -> f_flags.f_dmod = FALSE; 


/* and for efficiency, stop when we hit the first 
/* unused entry. 
if(fnp -> f_dir.dir_name(O] = ‘\0') 
return 0; 
else 
return OIRENT_SIZE; 
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Following this rule, dir_read() starts exactly this way. The algo- 
rithm also must determine if you hit the end of the directory. If you have, 
you need to make certain that you do not proceed past the last record; 
otherwise, you fill the di rent portion of the f_node for root directories 
or do a “seek” to the directory position and read the cluster number. If 
the returned cluster is FREE or itis LAST_CLUSTER, you return zero bytes 
read; otherwise, you compute the block within the cluster and offset 
within the block. You then read the directory entry into the dirent por- 
tion of the f_node and return DIRENT_SIZE bytes read. 

Function dir_write() functions slightly differently. It begins by 
determining if the f_node was modified by a write or create operation. 
If so, the function updates the disk block containing the entry and 
marks the buffer so that the buffer cache functions will write the block 
to disk when needed. This is simple for root directories because they 
are sequential disk blocks and a seek is a simple computation. All other 
directories require different handling. 


Listing 4.19 Source code for dir_read() and dir_write() 
functions — continued. 


COUNT 
dir_write (REG struct f_node *fnp) 
( 

struct buffer FAR *bp; 


/* Update the entry if it was modified by a write or create... */ 
if(fnp -> f_flags.f_dmod) 
{ 
/* Root is a consecutive set of blocks, so handling “i 
/* is simple. xy 
if(fnp -> f_flags.f_droot) 
( 
bp = 
getblock( 
(LONG)(fnp -> f_diroff / fnp -> f_dpb -> dpb_secsize 
+ fnp -> f_dpb -> dpb_dirstrt), 
fnp -> f_dpb -> dpb_unit); 
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Listing 4.19 Source code for dir_read() and dir_write() 
functions — continued. 


/* All other directories are just files. The only 
/* special handling is resetting the offset so that 
/* we can continually update the same directory entry. 
else 
( 

REG UWORD secsize = fnp -> f_dpb -> dpb_secsize; 


/* Do a “seek” to the directory position 

/* and convert the f_node to a directory f_node. 
fnp -> f_loffset = fnp -> f_diroff; 

fnp -> f_back = LONG_LAST_CLUSTER; 

fnp -> f_cluster = fnp -> f_dirstart; 

/* Search through the FAT to find the block 

/* that this entry is in. 

if(map_cluster(fnp, XFR_READ) != SUCCESS) 

( 


fnp -> f_flags.f_dfull = TRUE; 
release_f_node( fnp); 
return 0; 


/* If the returned cluster is FREE, return zero 
/* bytes read. 
if(fnp -> f_cluster = FREE) 
( 
release_f_node( fnp); 
return 0; 
) 


/* Compute the block within the cluster and the 
/* offset within the block. 
fnp -> f_sector = 
(fnp -> f_offset / secsize) 
& fnp -> f_dpb -> dpb_clsmask; 
fnp -> f_boff = fnp -> f_offset % secsize; 


/* Get the block we need from cache 
bp = getblock( 

(LONG)clus2phys(fnp -> f_cluster, 
fnp -> f_dpb -> dpb_clssize, 
fnp -> f_dpb -> dpb_data) 

+ fnp -> f_sector, 
fnp -> f_dpb -> dpb_unit); 
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Note that all directories other than root are just files. However, you 
do need to remember that directory operations are record-style opera- 
tions. This means that the only special handling is resetting the offset so 
that you can continually update the same directory entry. dir_write() 
handles the seek to the correct record in a similar fashion to the 
dir_read( ) algorithm, and the disk block is updated. When you suc- 
cessfully complete, you return DIRENT_SIZE bytes written. 


Closing a Directory 


dir_close() is the function responsible for closing a directory (List- 
ing 4.20). Because of the f_node design, it is surprisingly simple. First, 
dir_close() tests for invalid f_nodes to guarantee code integrity, 
although the f_node should be valid at this point. If the f_node con- 
tains garbage or is an invalid pointer, a kernel crash or corruption of the 
file system could be the result, so dir_close( ) guards against invalid 
f_nodes. 


Listing 4.19 Source code for dir_read() and dir_write() 
functions — continued. 


/* Now that we have a block, transfer the diectory */ 
/* entry into the block. */ 
if(bp — NULL) 
{ 
release_f_node( fnp): 
return 0; 
| 
putdirent((struct dirent FAR *)&fnp -> f_dir, 
(VOID FAR *)&bp -> b_buffer[fnp -> 
f_diroff % fnp -> f_dpb -> dpb_secsize]):; 
bp -> b_update = TRUE; 
] 
return DIRENT_SIZE; 
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dir_close() then proceeds to write the entry through a call to 
di r_write(). This is aconditional write because the dir_write( ) func- 
tion examines the f_node to see if a prior call modified it and only per- 
forms the write if necessary. This optimization helps performance by 
reducing unnecessary disk access. Remember, disks are electromechanical 
devices and their speed is very slow when compared to processor speeds. 

Next, dir_close() updates the buffer cache by calling 
flush_buffers(), causing all modified disk buffers, which include 
both file and directory data, to be written to disk. Again, 
flush_buffers() only conditionally updates modified buffers for 
optimization purposes. Once dirty buffers in the buffer cache are safely 
on disk, dir_close() decrements the reference count and releases the 
instance of the f_node. Note that in order to correctly share files when 
users spawn child processes, release_f_node() only returms the 
f_node to the f_node pool when the reference count is zero. 


Listing 4.20 Source code for dir_close() function. 


VOID 
dir_close (REG struct f_node *fnp) 
{ 
REG COUNT disk = fnp -> f_dpb -> dpb_unit; 


/* Test for invalid f_nodes 
if(fnp — NULL) 
return; 


#ifndef IPL 
/* Write out the entry 
dir_write(fnp); 

fendi f 


/* Clear buffers after release 
f lush_buffers(disk); 
setinvld(disk); 


/* and release this instance of the f_node 
--(fnp -> f_dpb) -> dpb_count; 
release_f_node(fnp) ; 
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File Management 


Find First/Next 


I now look at the first DOS support functions. Unlike UNIX-like oper- 
ating systems, a user cannot directly access an MS-DOS directory. 
However, the need for reading the directory contents is important. A 
user may want to look for a specific file, get the date or access rights of a 
file, or simply list the contents of the directory. The designers of MS-DOS 
designed the dos_findfirst() and dos_findnext() functions to 
handle these situations (Listing 4.21). 


Listing 4.21 Source code for dos_findfirst() and 
dos_findnext() functions. 


COUNT 
dos_findfirst (UCOUNT attr, BYTE FAR *name) 
[ 
REG struct f_node *fnp; 
REG dmatch FAR *dmp = (dmatch FAR *)dta; 
Struct dosnames DosName; 
REG COUNT i; 
BYTE *p; 


/* The findfirst/findnext calls are probably the worst 

/* of the DOS calls. They must work somewhat on the fly 

/* (i.e. - open but never close). Since we don't want to 
lose f_nodes every time a directory is searched, we wil] 
initialize the DOS dirmatch structure and then for 
every find, we will open the current directory, do a 
seek and read, then close the f_node. 


/* Start out by initializing the dirmatch structure. 
dmp -> dm_drive = default_drive; 

dmp -> dm_entry = 0; 

dmp -> dm_cluster = 0; 


dmp -> dm_attr_srch = attr; 


if(DosNames(name, (struct dosnames FAR *)&DosName) != SUCCESS) 
return DE_FILENOTFND; 
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These function calls must work somewhat on the fly. If you do not 
properly design the algorithm, dos_findfirst() and dos_findnext( ) 
could end up opening a directory and reading it but never closing it. This 
would cause you to loose an f_node for every directory search. To open 
a directory and never close it, you must properly implement the support 
functions, dir_open(), dir_read( ), and dir_write(). 


Listing 4.21 Source code for dos_findfirst() and 
dos_findnext() functions — continued. 


dmp -> dm_drive = DosName.dn_drive; 


/* Build the match pattern out of the passed string 
for(p = DosName.dn_name, i = 0; i < FNAME_SIZE; i++) 
[ 

/* test for a valid file name terminator 


if(*p $= '\0' && *p != ''.') 
[ 
/* If not a wildcard (‘*'), just transfer 
VEC OWS Fe") 
dmp -> dm_name_pat({i] = *p+: 
else 


{ 
/* swallow the wildcard 
Hp; 

/* fill with character wildcard (7?) 

for( ; i < FNAME_SIZE; it+) 
dmp -> dm_name_pat({i] = ‘?'; 

/* and skip to seperator 

while(*p != '\Q0' && *p != ''." @& *p 

f= '"/' B& *p t= ‘\\') 
+4); 


break; 
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Because you don’t want to lose f_nodes every time an application 
searches a directory, you initialize the MS-DOS dirmatch structure 
and then for every find, you open the current directory, do a seek and 
read, then close the f_node. You also store all necessary information in 
the dirmatch structure that is in user space. This way, the user indi- 
rectly handles f_node resource management. 

Look a little closer at dos_findfirst(). The function begins by 
initializing the dirmatch structure. Next, it parses the pattern in order 
to create an expanded pattern in the dirmatch structure to use in the 


Listing 4.21 Source code for dos_findfirst() and 
dos_ findnext() functions — continued. 


for( ; i < FNAME_SIZE; i++) 
dmp -> dm_name_pat({i] =‘ '; 


/* and the extension (don't forget to add trailing spaces)... 
TEC*p = 7) 
( 
++); 
for( ; i < FEXT_SIZE; i++) 
{ 
if(*p '= '\0' 6& *p != '.' B& *p != '/* BR *p JH 
{ 
LEAP dao oe) 
dmp -> dm_name_pat(i+FNAME_SIZE] = *p+; 
else 
{ 
for( ; i < FEXT_SIZE; i++) 
dmp -> dm_name_pat([i+FNAME_SIZE] = '?'; 
break; 
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directory reads. Note that the expanded pattern removes all *“**” charac- 
ters and fills the pattern with the character wildcard “*?”. The algorithm 
later used in dos_find() next automatically matches the character wild- 
card in the source pattern. This is how DOS-C implements wildcards. 


Listing 4.21 Source code for dos_findfirst() and 
dos_findnext() functions — continued. 


for( ; i < FEXT_SIZE; i++) 
dmp -> dm_name_pat[i+FNAME_SIZEJ] = '' '; 


/* Now search through the directory to find the entry... 
touc((BYTE FAR *)dmp -> dm_name_pat, FNAME_SIZE+FEXT_SIZE); 


/* Special handling - the volume id is only in the root 
/* directory and only searched for once. So we need to 
/* open the root and return only the first entry that 
/* contains the volume id bit set. 
if(attr & D_VOLID) 
{ 
/* Now open this directory so that we can read the 
/* f_node entry and do a match on it. 
if((fnp = dir_open( (BYTE FAR *)"\\")) == NULL) 
return DE_PATHNOTFND; 
/* Now do the search 
while(dir_read(fnp) = DIRENT_SIZE) 
( 
/* Test the attribute and return first found 
if(fnp -> f_dir.dir_attrib & D_VOLID) 
{ 
pop_dmp(dmp, fnp); 
dir_close(fnp); 
return SUCCESS; 


/* Now that we've done our failed search, close it and 
/* return an error. 

dir_close(fnp); 

return DE_FILENOTFND; 
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With the initialization complete, now search through the directory to 
find the entry. Use dos_findnext() to retum the first match, but first 
you must perform some special handling. The volume ID is only in the 
root directory and only searched for once. Whenever the attributes 
passed in from the user require a volume ID search, you need to open 
the root directory and return only the first entry where the volume ID 
bit is set. Open the root directory and search by using a whi |e loop that 
reads the directory entry and terminates when you reach the end of the 
directory. Within the loop, test the attribute of the directory entry just 
read and retum the first directory entry found (if any). If you do find a 
match, dos_findfirst() populates the users dirent structure by 


Listing 4.21 Source code for dos_findfirst() and 
dos_findnext() functions — continued. 


/* Otherwise just do a normal find next 
else 


{ 
BYTE LocalPath[67]; 


/* Build the full path so that we can open it 
sprintf(LocalPath, "%c:%s", ‘A’ + dmp -> dm_drive, 
*DosName.dn_path = ‘'\O' ? “." : DosName.dn_path); 


/* Now open this directory so that we can read the 

/* f_node entry and do a match on it. 

if((fnp = dir_open((BYTE FAR *)LocalPath)) == NULL) 
return DE_PATHNOTFND; 


pop_dmp(dmp, fnp); 
dmp -> dm_entry = 0; 
if(!fnp -> f_flags.f_droot) 
dmp -> dm_cluster = fnp -> f_dirstart; 
else 
dmp -> dm_cluster = 0; 
dir_close(fnp); 
return dos_findnext(); 
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using the pop_dmp( ) function, closes the file, and returns successfully. 
If dos_findfirst() did not find the entry, it closes the root directory 
and returms an error. 

If the volume ID attribute bit is not set, dos_findfirst() uses 
dos_findnext() to find the first match. dos_findfirst() prepares 
for the call by first building the full path so that it can open the direc- 
tory using dir_open( ). It opens this directory so that it can determine 
the starting cluster of the directory and save the cluster in the users 
dirent structure. Finally, it closes the directory, which in tum returns 
all resources, and exits through dos_findnext(). 

Although dos_findfirst() did a search for volume ID, 
dos_findnext() performs the bulk of the work. This partitioning of 
functionality guarantees identical searches for both functions. This is 
why dos_findfirst() exits through dos_findnext(). 

To start, dos_findnext() assigns a pointer to the match parame- 
ters. This pointer is actually a pointer to the disk transfer area because 
MS-DOS specifies that the dirmatch structure be contained within the 


Listing 4.21 Source code for dos_findfirst() and 
dos_findnext () functions — continued. 


COUNT 
dos_findnext (void) 
{ 
REG dmatch FAR *dmp = (dmatch FAR *)dta; 
REG struct f_node *fnp; 
BOOL found = FALSE; 
BYTE FAR *p, *q; 


/* assign our match parameters pointer. xy 
dmp = (dmatch FAR *)dta; 


/* Allocate an f_node if possible - error return (0) if not. */ 
if((fnp = get_f_node()) = (struct f_node *)0) 
[ 
return DE_FILENOTFND; 
) 
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dta (a remnant of the CP/M compatibility rules). Next dos_findnext() 
allocates an f_node if possible and returns a “File not found” error if it 
cannot. dos_findnext() next selects the drive parameter block entry 
that corresponds to the drive specified in the dirmatch structure. 
dos_findnext() then performs a media check to guarantee that the 
user did not change the disk between the calls and returns an error if 
this did happen. 


Listing 4.21 Source code for dos_findfirst() and 
dos_findnext() functions — continued. 


/* Force the f_node into read-write mode 

fnp -> f_mode = RDWR; 

/* Select the default to help non-drive specified path 
/* searches... 

fnp -> f_dpb = &bik_devices([dmp -> dm_drive]; 

++(fnp -> f_dpb) -> dpb_count; 


if(media_check(fnp -> f_dpb) < 0) 
{ 
--(fnp -> f_dpb) -> dpb_count; 
release_f_node(fnp); 
return DE_FILENOTFND; 
} 


fnp -> f_dsize = DIRENT_SIZE * (fnp -> f_dpb) -> dpb_dirents; 
/* Search through the directory to find the entry, but do */ 
/* a seek first. mh. 
if(dmp -> dm_entry > 0) 

fnp -> f_diroff = (dmp -> dm_entry - 1) * DIRENT_SIZE; 
fnp -> f_offset = fnp -> f_highwater = fnp -> f_diroff; 


fnp -> f_cluster = dmp -> dm_cluster; 
fnp -> f_dirstart = dmp -> dm_cluster; 
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Listing 4.21 Source code for dos_findfirst()and 
dos_findnext() functions — continued. 


/* Loop through the directory 
while(dir_read(fnp) = DIRENT_SIZE) 
( 
++dmp -> dm_entry; 
if(fnp -> f_dir.dir_name[0] != '\0' && fnp -> 
f_dir.dir_name(0] != DELETED) 
{ 
if (fcmp_wild((BYTE FAR *)(dmp -> dm_name_pat), 
(BYTE FAR *)fnp -> f_dir.dir_name, 
FNAME_SIZE+FEXT_SIZE)) 


/* Test the attribute as the final step*/ 
if(fnp -> f_dir.dir_attrib & D_VOLID) 
continue; 
else if( 
((~(dmp -> dm_attr_srch | D_ARCHIVE | D_RDONLY) 
& fnp -> f_dir.dir_attrib) 
& (D_DIR | D_SYSTEM | D_HIDDEN)) = 0) 
( 
found = TRUE; 
break; 
J 
else 
continue; 


/* If found, transfer it to the dmatch structure 
if (found) 
pop_dmp(dmp, fnp); 


/* return the result 
--(fnp -> f_dpb) -> dpb_count; 
release_f_node( fnp); 


return found ? SUCCESS : DE_FILENOTFND; 
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With all these preliminaries out of the way, dos_findnext() gets 
down to business and starts to search through the directory to find the 
entry. First it recovers information saved in the dirmatch structure by 
dos_findfirst() and does a seek to the directory record that corre- 
sponds to the next record to be read. dos_findnext() falls into a 
while loop whose exit criterion is an end-of-file on the directory. As it 
loops through the directory, it performs a wildcard match on the name 
field of the entry if it is not a deleted entry. If a match occurs, it tests the 
attribute as the final step because a match condition requires both name 
and attribute matches. If an entry meets both criteria, dos_findnext() 
transfers the entry into the dmatch structure and returns the resources. 
Finally, it returns a success or file-not-found error based on the result of 
the search. This satisfies MS-DOS requirements and completes the first 
look at DOS support functions. 


Listing 4.22. Source code for dos_open() function. 


COUNT 
dos_open (BYTE FAR *path, COUNT flag) 
{ 
REG struct f_node *fnp; 
COUNT i; 
BYTE FAR *fnamep; 
BYTE dname[NAMEMAX ] ; 
BYTE fname({FNAME_SIZE], fextLFEXT_SIZE]; 


/* First test the flag to see if the user has passed a 
/* valid file mode... 
if(flag < 0 || flag > 2) 

return DE_INVLDACC; 


/* first split the passed dir into comopnents (i.e. - path 
/* to new directory and name of new directory. 
if((fnp = split_path(path, dname, fname, fext)) == NULL) 
( 
dir_close(fnp); 
return DE_PATHNOTFND; 
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Opening a File 


If you have followed the directory functions so far, most of what you 
will see in the file handlers will look very familiar. The DOS-C design 
reuses code and algorithms wherever possible, facilitated through the 
use of the f_node data structure common to all the file system func- 
tions. I will examine dos_open() to study how a file access works 
(Listing 4.22). The reader can look at other functions such as 
dos_create( ) because of the similarity in design. 


Listing 4.22 Source code for dos_open() function — 
continued. 


/* Look for the file. If we can't find it, just return a 
/* not found error. 
if(!find_fname(fnp, fname, fext)) 
( 
dir_close(fnp); 
return DE_FILENOTFND; 
) 


/* Set the f_node to the desired mode 
fnp -> f_mode = flag; 


/* Initialize the rest of the f_node. 
fnp -> f_offset = 01; 
fnp -> f_highwater = fnp -> f_dir.dir_size; 


fnp -> f_back = LONG_LAST_CLUSTER; 
fnp -> f_cluster = fnp -> f_dir.dir_start; 


fnp -> f_flags.f_dmod = FALSE; 
fnp -> f_flags.f_dnew = FALSE; 
fnp -> f_flags.f_ddir = FALSE; 


return x1lt_fnp(fnp); 
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The algorithm for dos_open( ) is similar to that of dir_open( ). The 
only changes are that a file also has a mode associated with it and that it 
must return a number that will be later used to derive the handle 
returned to the user. 

dos_open() begins by first testing the flag to see if the user has 
passed a valid file mode. This type of testing is necessary for all of the 
DOS support functions. An application can easily pass erroneous infor- 
mation and DOS-C must handle this in a sane way. Crashing because of 
bad parameters passed to a call is definitely not an option. 

Once parameter testing is complete, the path passed in the call needs 
to be validated. dos_open( ) splits the path into components (1.e., path 
to directory, filename, and file extension) and tests for a valid path. This 
information is needed to locate the file. If the file can’t be found, 
dos_open() returns a file-not-found error. If the file is found, 
dos_open( ) proceeds to set the f_node to the desired mode and initial- 
izes the rest of the f_node. It finally returns through x1t_fnp(), which 
translates an f_node pointer to a numeric index for later use. 

With f_node initialized, any file operation can take place, and 
f__node retains the state of the file. It is this data structure that contains 
the directory image, current size, pointer to the next byte to be read, etc. 


Reading/Writing a File 


Reading and writing a file is a large part of the DOS-C file system 
code. This code is used for loading programs as well as file access 
to user applications. DOS-C provides two functions, dos_read( ) 
and dos_write() for DOS support functions, but a single function, 
rdwrblock( ), does all the work (Listing 4.23). Both dos_read() and 
dos_write() call rdwrblock() with XFR_READ to read a file and 
XFR_WRITE to write a file. 

rdwrblock() is common to both functions because of the similarity 
between file read and write operations. In both cases, data is transferred 
with only data direction differences. The block cache mechanism sim- 
plifies matters by providing automatic mechanisms for writing dirty 
buffers. Hence, data is transferred either from memory to a buffer or 
vice versa. I will note other small differences along the way. 
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As you have seen before, rdwrblock() starts off by translating the 
internal handle into an f_node pointer because all internal file system 
Operations use f_nodes to maintain the state of the file. Part of the 
translation requires testing the validity of the file. For example, the 
application may erroneously close the file and attempt to read from it or 


Listing 4.23 Source code for rdwrblock() function. 


UCOUNT 
rdwrblock (COUNT fd, VOID FAR *buffer, UCOUNT count, COUNT mode, 
COUNT *err) 
( 
REG struct f_node *fnp; 
REG struct buffer FAR *bp; 
UCOUNT xfr_cnt = 0, ret_cnt = 0; 
LONG idx; 
WORD secsize; 
UCOUNT to_xfer = count; 


#Hifdef DEBUG 
if (bDumpRdWrParms ) 
[ 
printf("rdwrblock: mode = %s\n", 
mode = XFR_WRITE ? “WRITE” : “READ"); 
printf(" fd buffer COUNT\N, ==) eoes4 * Oeste \n"); 
printf(" 202d %04x:%304x &%d\n", fd, (COUNT)FP_SEG(buffer), 
(COUNT )FP_OFF(buffer), count); 
j 


fendif 
/* Translate the fd into an f_node pointer, since all xy 
/* internal operations are achieved through f_nodes. oy 


fnp = xlt_fd(fd); 


/* If the fd was invalid because it was out of range or the */ 
/* requested file was not open, tell the caller and exit os 
/* note: an invalid fd is indicated by a O return *} 
if(fnp = (struct f_node *)0 || fnp -> f_count <= 0) 
( 

*err = DE_INVLDHNOL; 

return 0; 
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write to it. If the internal handle is invalid because it is out of range or 
the requested file is not open, rdwrblock() notifies the calling DOS 
support function of the error and exits. Next, rdwrblock() tests that 
the requested number of bytes for data transfer is not zero. If the count 
is zero and the mode is XFR_READ, rdwrblock() just exits because a 
read with a count of zero is a valid call but does not require any further 
processing. Additionally, a write with a count of zero is a special case 
that sets the file length to the current length. Another test is to check for 
a seek past end-of-file on an XFR_READ operation. This is also a valid 
read operation that must return O bytes transferred. Again, it is easy to 
test this up front and avoid further processing. 


Listing 4.23 Source code for rdwrblock() function — 
continued. 


Test that we are really about to do a data transfer. If 
the count is zero and the mode is XFR_READ, just exit. 
(Any read with a count of zero iS a nop). 


A write (mode is XFR_WRITE) is a special case, it sets 
the file length to the current length (truncates it). 


NOTE: doing this up front saves a lot of headaches later. 
if(count = 0) 
( 
if(mode = XFR_WRITE) 
fnp -> f_highwater = fnp -> f_offset; 
{ 
*x*err = SUCCESS; 
return 0; 


} 


/* Another test is to check for a seek past EOF on an 
/* XFR_READ operation. 
if(mode = XFR_READ 
&& !fnp -> f_flags.f_ddir 
&& (fnp -> f_offset >= fnp -> f_dir.dir_size)) 
{ 


*err = SUCCESS: 
return 0: 
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With the special cases out of the way, you begin testing that the 
operation is valid. First, rdwrb]ock( ) tests that the user call passed a 
valid mode for this f_node. This consists of testing combinations of 
read-only and read-wnte in the case of read operations and write-only 
and read-write for write operations. If these two combinations do not 
exist, then an invalid access error is retumed. 

Next, rdwrb]ock() adjusts the far pointer from user space to supervi- 
sor space with a call to the memory management ad just_far() function. 


Listing 4.23 Source code for rdwrblock() function — 
continued. 


/* test that we have a valid mode for this f_node 
switch(mode) 
{ 
case XFR_READ: 
if(fnp -> f_mode != RDONLY && fnp -> f_mode != ROWR) 
{ 
*err = DE_INVLDACC; 
return 0; 
} 
break; 


fHifndef IPL 
case XFR_WRITE: 
if(fnp -> f_mode != WRONLY && fnp -> f_mode != RDWR) 
{ 
*err = DE_INVLDACC; 
return 0; 
} 
break; 
fendi f 
default: 
*err = DE_INVLDACC; 
return D; 
} 


/* The variable secsize will be used later. */ 
secsize = fnp -> f_dpb -> dpb_secsize; 


/* Adjust the far pointer from user space tp supervisor space*/ 
buffer = adjust_far((VOID FAR *)buffer); 
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Although DOS-C is real mode only, the call is still necessary because of the 
segment architecture. Adjust the segment and offset so that you have a 
usable range for the transfer. Once the pointer has been adjusted, proceed 
to transfer the data. rdwrblock() uses a block transfer method so that 
later versions of DOS-C can use memory management. 


Listing 4.23 Source code for rdwrb]ock() function — 
continued. 


/* Do the data transfer. Use block transfer methods so 
/* that we can utilize memory management in more complex 
/* DOS-C versions. 
while(ret_cnt < count) 
( 
/* Position the file to the f_node’s pointer position. 
/* This is done by updating the f_node's cluster, block 
/* (sector) and byte offset so that read or write 
/* becomes a simple data move into or out of the block 
/* data buffer. 
if(fnp -> f_offset = 01) 
( 
fHifndef IPL 
/* For the write case, a newly created file 
/* will have a start cluster of FREE. If we're 
/* doing a write and this is the first time 
/* through, allocate a new cluster to the file. 
if( (mode = XFR_WRITE) 
&& (fnp -> f_dir.dir_start — FREE)) 
if(!first_fat(fnp) ) 
( 
dir_close(fnp); 
*err = DE_HNDLDSKFULL; 
return ret_cnt; 
} 
endif 


/* complete the common operations of 

/* initializing to the starting cluster and 
/* setting all offsets to zero. 

fnp -> f_cluster = fnp -> f_dir.dir_start; 
fnp -> f_back = LONG_LAST_CLUSTER; 

fnp -> f_sector = 0; 

fnp -> f_boff = 0; 
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rdwrblock() uses a while loop that exits when the bytes trans- 
ferred equal the requested amount. This guarantees that the correct 
number of bytes is transferred. Next, the loop positions the file to the 
pointer position of the f_node. This is done by updating the cluster, 
block (sector), and byte offset of the f_node so that read or wnite 
becomes a simple data move into or out of the block data buffer. There 


Listing 4.23 Source code for rdwrblock() function — 
continued. 


/* The more difficult scenario is the (more common) 
/* file offset case. Here, we need to take the f_node's 
/* offset pointer (f_offset) and translate it into a 
/* relative cluster position, cluster block (sector) 
/* offset (f_sector) and byte offset (f_boff). Once we 
/* have this information, we need to translate the 
/* relative cluster position into an absolute cluster 
/* position (f_cluster). This is unfortunate because 
/* it requires a linear search through the file’s FAT 
/* entries. It made sense when DOS was originally 
/* designed as a simple floppy disk operating system 
/* where the FAT was contained in core, but now 
/* requires a search through the FAT blocks. 
/* 
/* The algorithm in this function takes advantage of 
/* the blockio block buffering scheme to simplify the 
/* task. 
else 

switch(map_cluster(fnp, mode) ) 

[ 

case DE_SEEK: 

dir_close(fnp); 
return ret_cnt; 


default: 
dir_close(fnp); 
*err = DE_HNDLDSKFULL; 
return ret_cnt; 


case SUCCESS: 
break; 
} 
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iS a special case that must be examined. For write operations, a newly 
created file will have a start cluster of FREE. Check for this condition 
because if this is the case, you must allocate a new cluster to the file. 
Then rdwrblock() completes the common operations of initializing to 
the starting cluster and setting all offsets to zero. 


Listing 4.23 Source code for rdwrblock() function — 
continued. 


fHifndef IPL 
/* XFR_LWRITE case only - if we're at the end, the next */ 
/* FAT is an EOF marker, so just extend the file length */ 
if(mode = XFR_WRITE && last_link(fnp)) 
if (!extend(fnp) ) 
{ 
dir_close(fnp); 
*err = DE_HNDLDSKFULL; 
return ret_cnt; 


#tendi f 
/* Compute the block within the cluster and the offset */ 


/* within the block. */. 
fnp -> f_sector = 


(fnp -> f_offset / secsize) & fnp -> f_dpb -> dpb_clsmask; 
fnp -> f_boff = fnp -> f_offset % secsize; 


#Hifdef OSK_DEBUG 
printf("%d links; dir offset %ld, starting at cluster %d\n", 
fnp -> f_count, 
fnp -> f_diroff, 
fnp -> f_cluster); 
{fendif 


/* Do an EOF test and return whatever was transferred 
/* but only for regular files in XFR_READ mode 
if ((mode = XFR_READ) && !(fnp -> f_flags.f_ddir) 
&& (fnp -> f_offset >= fnp -> f_dir.dir_size)) 
{ 
*err = SUCCESS; 
return ret_cnt; 
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Listing 4.23 Source code for rdwrblock() function — 
continued. 


/* Get the block we need from cache*/ 
bp = getblock((LONG)clus2phys(fnp -> f_cluster, 
fnp -> f_dpb -> dpb_clssize, 
fnp -> f_dpb -> dpb_data) + fnp -> f_sector, 
fnp -> f_dpb -> dpb_unit); 
if(bp = (struct buffer *)0) 
( 
*err = DE_BLKINVLD; 
return ret_cnt; 


/* transfer a block 
/* Transfer size as either a full block size, or the 
/* requested transfer size, whichever is smaller. 
/* Then compare to what is left. since we can transfer 
/* a maximum of what is left. 
swi tch(mode ) 
{ 
case XFR_READ: 
if(fnp -> f_flags.f_ddir) 
xfr_cnt = min(to_xfer, secsize - fnp -> f_boff); 
else 
xfr_cnt = min(min(to_xfer, secsize - fnp -> f_boff), 
fnp -> f_dir.dir_size - fnp -> 
f_offset); 
fbcopy( (BYTE FAR *)&bp -> b_buffer(fnp -> f_boff]. 
buffer, xfr_cnt); 
break; 


#ifndef IPL 
case XFR_WRITE: 
xfr_cnt = min(to_xfer, secsize - fnp -> f_boff); 
fbcopy (buffer, (BYTE FAR *)&bp -> b_buffer(fnp -> f_boff], 
xfr_cnt); 
bp -> b_update = TRUE; 
break; 
ifendif 


default: 
*xerr = DE_INVLDACC; 
return ret_cnt; 
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The more common scenario is the common file offset case. Here, you 
need to take the offset pointer of the f_node (f_offset) and translate it 
into a relative cluster position, cluster block (sector), offset (f_sector), 
and byte offset (f_boff). Once you have this information, you need to 
translate the relative cluster position into an absolute cluster position 
(f_cluster). This is unfortunate because it requires a linear search 


Listing 4.23 Source code for rdwrblock() function — 
continued. 


/* update pointers and counters 
ret_cnt += xfr_cnt; 
to_xfer -= xfr_cnt; 
fnp -> f_offset += xfr_cnt; 
buffer = add_far((VOID FAR *)buffer, (ULONG)xfr_cnt); 
if(mode = XFR_WRITE && 
(fnp -> f_offset > fnp -> f_highwater) ) 
fnp -> f_highwater = fnp -> f_offset; 
*err = SUCCESS; 
return ret_cnt; 
| 


COUNT 
dos_read (COUNT fd. VOID FAR *buffer, UCOUNT count ) 
( 

COUNT err, xfr; 


xfr = rdwrblock(fd, buffer, count, XFR_READ, &err); 
return err '= SUCCESS ? err : xfr; 


ifndef IPL 
COUNT 
dos_write (COUNT fd, VOID FAR *buffer, UCOUNT count) 
| 
COUNT err, xfr; 


xfr = rdwrblock(fd. buffer. count, XFR_WRITE, &err); 
return err != SUCCESS ? err : xfr; 

J 

ffendif 
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through the file’s FAT entries. This scenario made sense when DOS was 
originally designed as a simple floppy disk operating system where the 
FAT was contained in core, but a search through the FAT blocks is 
required. The algorithm used by rdwrblock() takes advantage of the 
blockio block buffering scheme to simplify the task. Because the FAT 
is contained within a block, the block may remain within the cache if 
there are enough buffers. This minimizes seek and rotational latencies. 

After rdwrblock() determines the physical addresses, it performs 
another test for a special case. In the write case only, rdwrblock() must 
check for end-of-file before the data transfer can take place. You need to 
extend the file by allocating another cluster before you can transfer data 
to it. For this reason, the test must occur before the transfer. If the file is 
at the end, the next FAT is an EOF marker and rdwrblock() just 
extends the file’s cluster chain. 

For read operations, rdwrblock() does an EOF test for every loop 
iteration. You need to test for end-of-file because it is a valid termina- 
tion condition. If you encounter this situation, you retum a count of 
whatever was transferred for regular files in read mode. 

If it was not an end-of-file, rdwrblock() proceeds to get the block 
you need from cache and transfer data to or from the block. Transfer 
size is either a full block size, or the requested transfer size, whichever 
is smaller. When this transfer is complete, rdwrblock( ) compares its 
count to what is left, since you must transfer what is left. rdwrblock( ) 
updates pointers and counters and repeats the loop. It returns the num- 
ber of bytes transferred when the loop terminates. 


Closing a File 


Much of the dos_close() algorithm (Listing 4.24) will seem very 
familiar if you have studied the dir_close() function (Listing 4.20). 
Like the other file system functions, dos_close() first translates the 
internal handle into a useful pointer. If the corresponding f_node is 
invalid because the internal handle is out of range or the requested file 
is not open, dos_close() simply returns an error. It then looks at the 
file mode. If the file is not read-only, a write may have taken place. 
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dos_close() updates the directory size and sets a flag that will cause 
the directory record to be written. It then calls dir_close() to update 
the directory, if necessary, and exits. 


More Functions 


By now, you have looked at file system operations that perform data 
transfer in one form or another. The file system functions also contain 
other utilitarian functions that retum file sizes, file creation or modifica- 
tion dates, times, etc. The functions are all built from building blocks 
similar to the data transfer functions and can be easily understood if 
you have understood the explanations of the data transfer functions. 
MS-DOS is feature-rich with utility functions; it will be useful to study 
the code in these files. 









Listing 4.24 Source code for dos_close() function. 


COUNT 
dos_close (COUNT fd) 
( 
struct f_node *fnp; 


/* Translate the fd into a useful pointer */ 
fnp = xlt_fd(fd); 


/* If the fd was invalid because it was out of range or the 
/* requested file was not open, tell the caller and exit *] 
/* note: an invalid fd is indicated by a 0 return a 
if(fnp = (struct f_node *)0 || fnp -> f_icount <= 0) 

return DE_INVLDHNDL; 
if(fnp -> f_mode != RDONLY) 
( 













fnp -> f_dir.dir_size = fnp -> f_highwater; 
fnp -> f_flags.f_dmod = TRUE; 

} 

fnp -> f_flags.f_ddir = TRUE; 







dir_close(fnp); 
return SUCCESS; 


Chapter 5 


DOS-C Kernel: 
Memory Manager 
and Task Manager 


In this chapter, I will examine two resource managers: the memory 
manager and the task manager. Both of these managers serve an impor- 
tant role in the operations of DOS-C (Figure 5.1). Each manager pro- 
vides system call services that are part of the MS-DOS collection of 
system services. They also provide key internal services necessary to 
the other DOS-C services. One note: although the code is written to be 
portable, portions of these managers, especially the task manager, con- 
tain 80x86-specific code. 
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In DOS-C, the memory manager provides and maintains the memory 
allocation of DOS-C. Unlike other operating systems, the memory man- 
ager does not provide protection to the memory it manages. It cannot, 
because DOS-C is designed to work in real mode so that it runs on XT 
systems, which are powered with the 8088 processor. Memory protec- 
tion mechanisms are not available in the low-end 8088 and in the real 


Figure 5.1 Memory and task resource managers functionality layer. 
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mode of 80286 and newer processors. However, the algorithms used can 
easily be adapted (and have been in related operating systems) to pro- 
vide the underlying allocation mechanism for protected-mode versions. 

The task manager is also simplified when compared to task manag- 
ers of other operating systems. MS-DOS is a simple operating system 
that can only perform a single task at a time. Because DOS-C is pat- 
lerned after it, DOS-C also provides support for a single task at a time. 
Its primary responsibility is to act as a task loader and perform a task 
switch from kernel mode to user mode and vice versa. Again, the code 
presented here is derived from more sophisticated multitasking code, 
and the elements necessary for multitasking are contained in this code. 
However, what is sorely missing is a scheduler, but one can be easily 
introduced if you are inclined to do so. 


Arena Management 


In MS-DOS, the memory area is known as the arena. The arena is a col- 
lection of one or more areas of memory that contain a data structure 
followed by available memory. DOS-C uses the same data structures as 
MS-DOS to maintain the memory arena. This is necessary, because 
many programs examine the memory header when they want to walk 
the memory chain to find other programs that may reside in memory. In 
faci, much of the “undocumented” literature is devoted to explaining 
the memory header and presents special programs to dump the header. 
Because the goal of DOS-C is to be compatible with MS-DOS, the 
memory header is identical. 

Like MS-DOS, DOS-C arranges the memory arena as a collection of 
variable-length blocks. Each block has a Memory Control Block (MCB) 
defined by the mcb data structure (Listing 5.1). This header starts with a 
single byte that is ASCII “M” except in the last block which starts with 
an ASCII “Z”. All memory blocks must have an owner, and DOS-C 
identifies the owner by assigning the segment of the program segment 
prefix of the owner. This entry is the m_pSp member. Note that the use of 
the segment is a direct 80x86 architecture derivative. 
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Another 80x86-dependent entry is the size member, m_size. This 
member contains the size of the memory block in 16-byte paragraphs that 
match the 80x86 paragraph size. The user gets the memory block starting 
at the next byte after the MCB header, which falls, coincidentally, on a 
paragraph boundary. This field is also instrumental in tracing the MCB 
chain. If you want to determine where the next MCB header will be, take 
the address of the block and add m_s 7ze to it. The next MCB block starts 
at this address. Of course you must perform all computations in 80x86 
segment:offset fashion. Unfortunately, this is not very portable, but porta- 
bility must be sacrificed for compatibility purposes. However, the code is 
written in a portable manner and the use of any other granularity such as 
memory pages can be substituted. For example, an 80386 or 68040 ver- 
sion can use 4Kb pages that correspond to MMU pages. 

The last MCB member is m_name. In a departure from standard 
MS-DOS, DOS-C places the filename of the owner in this field. This 
allows easy owner identification of the block when programs that walk 
the MCB chain dump the header contents. DOS-C, unlike MS-DOS, 
also clears this field when the MCB becomes available. 

Another member is m_fil]. This odd little unused portion of the 
MCB header is a fix that brings the size of the header to fit exactly 
within one paragraph. Again, because of MS-DOS’ strong root in the 
80x86 world, assembly language arithmetic is smaller and faster if you 
can ignore the offset and work with the segment only. Microsoft chose 
to do this and sacrificed 3 bytes to simplify the math. This 1s a good 
software engineering tradeoff that results in cleaner code. 


Listing 5.1 The mcb data structure. 


typedef struct 
{ 
BYTE m_type; /* mcb ype - chain or end 
UWORD m_psp; /* owner id via psp segment 
UWORD m_size; /* size of segment in paragraphs 
BYTE m_fill1[3];: 
BYTE m_name[8]; /* owner name limited to 8 bytes 
} mcb; 
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Allocation Strategy Management 


DOS-C provides five functions to manage memory. These functions 
provide allocation, deallocation, size modification, and validation func- 
tions. However, not all functions are available to the user. MS-DOS and 
DOS-C provide system calls to allocate, change size, and return mem- 
ory to DOS-C. The other functions provide DOS-C with the ability to 
find a block of memory large enough to load a program and to verify 
that the arena 1s intact. Unfortunately, DOS-C needs this functionality 
because any program could destroy the arena or DOS-C itself because 
it lacks memory protection schemes. 


Memory Allocation 


For memory allocation, DOS-C provides the function DosMemA1 loc() 
(Listing 5.2), which can allocate memory using three different criteria: 
first fit, best fit, and last fit. It can also return the largest memory block 
available for program loading (Figure 5.2). The algorithm used is 
straightforward — search each memory block until it encounters the 
last one or meets an exit criteria. If a block matches the criteria of the 
search, DosMemAl1loc() sets a local variable TRUE and terminates the 
loop, unless it determines that it encountered an end block after the test, 
in which case it returns an error. 


Listing 5.2 The DosMemA11oc() function. 


COUNT 
DosMemAlloc (int size, COUNT mode, seg FAR *para, UWORD FAR *asize) 
{ 

REG mcb FAR *p; 

mcb FAR *q; 

COUNT i; 

BOOL found; 


/* Initialize 
p = (mcb FAR *)(MK_FP(first_mcb, 0)); 
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For each iteration, DosMemAl1oc() checks for a corrupted memory 
block. The test it performs simply looks at the m_type member to 
determine whether the block is valid. If it is, neither an ASCII ““M” nor 
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Listing 5.2 The DosMemA1 1 oc () function — continued. 


/* Search through memory blocks */ 
for(q = (mcb FAR *)0, i = 0, found = FALSE; !found; ) 
{ 
/* check for corruption */ 
if(p -> m_type != MCB_NORMAL && p -> m_type != MCB_LAST) 
return DE_MCBDESTRY; 


/* Test if free Sased on mode rules 
swi tch(mode ) 
{ 
case LAST_FIT: 
default: 
/* Check for a last fit candidate 
if(p -> msize >= size && p -> mpsp = FREE_PSP) 
/* keep the last know fit 
q =); 
/* not free - bump the pointer 
if(p -> m_type != MCB_LAST) 
p = MK_FP(far2para((VOID FAR *)p) + p -> m_size + 
1. 0); 
/* was there no room (q = 0)? */ 
else if(p -> m_type == MCB_LAST && q (mcb FAR *)0) 
return DE_NOMEM; 
/* something was found - continue «| 
else 
found = TRUE: 
break; 


case FIRST_FIT: 
/* Check for a first fit candidate 
if(p -> m_size >= size && p -> m_psp FREE_PSP) 
( 


q =); 
found = TRUE; 
break; 
} 
/* not free - bump the pointer */ 
if(p -> m_type != MCB_LAST) 
p = MK_FP(far2para((VOID FAR *)p) + p -> m_size + 
1, 0); 
/* nothing found till end - no room */ 
else 
return DE_NOMEM; 
break; 
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an ASCII “*Z”, DosMemA1 1 oc ( ) terminates and returns an error. If it is a 
valid memory block, DosMemA1 loc() tests to see if it is free, and if so, 
proceeds to test it based on mode rules. The algorithm then diverges at 
this point when it tests the block. A switch statement splits the execu- 
tion path based on the requested mode. 

DosMemA11oc() tests for last fit by searching through each mem- 
ory control block. If the size of the block is greater than or equal to 
the requested size, DosMemAl1oc() notes the address of the block. If 
DosMemA] 1oc( ) examined the block marked with the end of the mem- 
ory arena, it terminates and returns an error; otherwise, it proceeds to 
common completion code. 


Listing 5.2 The DosMemAl1oc() function — continued. 


case BEST_FIT: 
/* Check for a best fit candidate 
if(p -> msize >= size && p -> m_psp = FREE_PSP) 
{ 
if(i =0 || p -> msize < i) 
{ 
j p -> msize; 
q =P; 
} 
} 
/* not free - bump the pointer *] 
if(p -> mtype != MCB_LAST) 
p = MK_FP(far2para((VOID FAR *)p) + p -> m_size + 
1, 0); 
/* was there no room (q = 0)? */ 
else if(p -> m_type = MCB_LAST && q = (mcb FAR *)0) 
return DE_NOMEM; 
/* something was found - continue */ 
else 
found = TRUE; 
break; 
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Listing 5.2 The DosMemA11oc() function — continued. 


case LARGEST: 
/* Check for a first fit candidate */ 
if((p -> mpsp = FREE_PSP) && (i = 0 || p -> msize > i)) 
{ 
size = *asize = i = p -> m-size; 
q =P; 
} 
/* not free - bump the pointer */ 
if(p -> m_type != MCB_LAST) 
p = MK_FP(far2para((VOID FAR *)p) + p -> m_size + 
1, 0); 
/* was there no room (q = 0)? */ 
else if(p -> m_type = MCB_LAST && q = (mcb FAR *)0) 
return DE_NOMEM; 


/* something was found - continue */ 
else 
found = TRUE; 
break; 
} 

} 
p=q; 
/* Larger fit case */ 


if(p -> msize > size) 
{ 
if(mode != LAST_FIT) 
{ 
= MK_FP(far2para((VOID FAR *)p) + size + 1, 0); 
* Always flow m_type up on alloc */ 
-> m_type = p -> m_type; 
-> m_type = MCB_NORMAL; 
-> M_psp = cu_psp; 
-> m_psp = FREE_PSP; 
-> msize = p -> msize - size - 1; 
->m ee = size; 
or(i = 0; i < B; i++) 
p -> mname([i] = q -> mname[i] = ‘\0'; 


eo ee > o 
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DosMemA11oc() tests for first fit in a way similar to the last fit case. 
It also searches through each memory control block. If the size of the 
block is greater than or equal to the requested size, the address of the 
block is noted. However, unlike last fit, it exits the loop and proceeds to 
common completion code. If DosMemAlloc() reaches the end of the 
memory arena, it also terminates and returns an error. 


Listing 5.2 The DosMemA?1 0c () function — continued. 


else 
{ 
q = MK_FP(far2para((VOIO FAR *)p) + 
(p -> msize - size), 0): 
* Always flow m_type up on alloc */ 
> m_type = p -> m_type; 
> m_type = MCB_NORMAL; 
-> M_psp = cu_psp; 
> m_psp = FREE_PSP; 
> m_size =p -> msize - size - 1; 
> m_size = size; 
(i = 0: i < 8; iH) 
p -> mname(i] = q -> m_name[i] = '\0': 
} 
/* Found - return good */ 
*para = far2para((VOID FAR *)(mode = LAST_FIT ? 
(VOID FAR *)q : (VOID FAR *)p)); 
return SUCCESS; 
} 
/* Exact fit case */ 
else if(p -> m_size = size) 
{ 
p -> mM_psp = Cu_psp: 
for(i = 0; i < 8; i++) 
p -> mnameLi] = ‘\0'; 
/* Found - return good */ 
*para = far2para((VOID FAR *)(BYTE FAR *)p); 
return SUCCESS; 
} 
else 
return DE_MCBDESTRY; 
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The best fit test uses a local variable to maintain the size of the block 
closest to the requested size. In a manner similar to the last fit case, 
DosMemA 11 0c( ) compares the requested size to the block size. If the size 
of the block is greater than or equal to the requested size, DosMemA1 1oc( ) 
then proceeds to test the block size. If it is smaller than the last block 
that was greater than the requested size, DosMemAlloc() notes the 
address of the block. In this way, it determines the smallest block that 
will satisfy the request. If DosMemA11o0c() reaches the end of the mem- 
ory arena, it terminates .ind returns an error; otherwise, it proceeds to 
common completion code. 

The test for the largest memory block is the simplest case. DosMemA110c( ) 
simply determines which memory block is the largest and notes the 
address of this block. In this way, DosMemAlloc() determines the 
smallest block that will satisfy the request. If it does not find a block, it 
terminates and returns an error; otherwise, it proceeds to common com- 
pletion code. 

DosMemA11o0c() completes the search by modifying the block that it 
found. If the block is larger than requested, DosMemA110c( ) proceeds to 
carve out the size of block requested from the block found. If, however, 
the block found is identical in size to that requested, DosMemAl 10c( ) 
simply reassigns the block. With all arena maintenance out of the way, 
DosMemA110c() exits by returning the segment of paragraph following 
the MCB. This segment forms the address of the first byte of allocated 
memory that the user may utilize. 


Memory Deallocation 

DOS-C provides a complement to DosMemAl loc() with DosMemFree( ) 
(Listing 5.3). DosMemFree( ) takes the segment acquired from DosMemA1 loc ( ) 
and retums it to the free pool. In comparison, DosMemFree( ) is much sim- 
pler because there is only one way to return a memory block, whereas 
there are four ways to allocate one. Its job is a simple one: check that 
the block is valid and combine it with adjacent blocks if possible. 
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Listing 5.3 The DosMemFree() function. 


COUNT 
DosMemFree (int para) 


{ 


REG mcb FAR *p, FAR *q; 
COUNT i; 


/* Initialize 
p = (mcb FAR *}(Mk Fripara, 0)); 


/* check for corruption 
if(p -> m_type != MCB_NORMAL && p -> m_type != MCB_LAST) 
return DE_INVLDMCB; 


/* Mark the mcb as free so that we can later 
/* merge with other surrounding free mcb's 
p -> m_psp = FREE_PSP; 
for(i = 0; i < B; i++) 
p -> m.name[i} = '\0'; 


/* Now merge free blocks 

for(p = (mcb FAR *)(MK_FP(first_mcb, 0)); p -> m_type 
'= MCB_LAST; p = q) 

( 


/* make q a pointer to the next block 
q = MK_FP(far2para((VOID FAR *)p) + p -> msize + 1, 0); 
/* and test for corruption 
if(q -> m.type != MCB_NORMAL && q -> m_type != MCB_LAST) 
return DE_MCBDESTRY; 
if(p -> mpsp != FREE_PSP) 
continue; 


/* test if next is free - if so merge 
if(q -> mpsp = FREE_PSP) 
{ 
/* Always flow type down on free 
p -> mtype =q -> m_type; 
p -> msize t= q -> msize + 1; 
/* and make pointers the same since the next 
/* free is now this block 
q=-D; 
) 


} 
return SUCCESS; 
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DosMemFree() first checks for corruption in the same way that 
DosMemAl1oc() does. DosMemFree() then marks the MCB as free so 
that you can later merge it with other surrounding free MCBs. Once 
marked, DosMemA110c() proceeds to merge adjacent free blocks. 

In order to free an MCB, DosMemF ree( ) has to go through each one 
to determine if the previous block and the following block are avail- 
able. It tries to merge all blocks that become available in a belt and sus- 
penders fashion. The algorithm uses a pointer to the next block. 
DosMemFree/ : isodates this pointer on each iteration based on the con- 
tent of the last block and tests for corruption. Again, because of lack of 
memory protection hardware in low-end 80x86 processors, any task 
can corrupt an MCB. 

If the MCB passes the corruption test, DosMemFree() tests the next 
MCB to see if it is free. If it is, it merges the two blocks by first setting 
the m_psp structure member to FREE_PSP. It then takes the type from 
what had been the adjacent free block and copies it into the current free 
block. The algorithm counts on the fact that only the last block is marked 
free; if you merge with the last block, then the current block becomes the 
last block. It then adjusts the size of the current block and exits. 

The next function, DosMemLargest( ) (Listing 5.4), is never seen by 
the user. Its purpose in DOS-C is to assist the task manager in locating 
the largest available free block. Task manager needs this information 
because it follows the model of MS-DOS in its operation. Under certain 
conditions in the process of loading a program, the task manager must 
load the program in the largest available memory. DosMemLargest() is 
called to return this information. 


Listing 5.4 The DosMemLargest() function. 


seg 
DosMemLargest (seg FAR *size) 
( 


REG mcb FAR *p; 
mcb FAR *q; 
COUNT found; 


/* Initialize 
p = (mcb FAR *)(MK_FP(first_mcb, 0)); 
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In a simple fashion, DosMemLargest() searches through each mem- 
ory block to find the largest one. As you have seen before, first it checks 
for and exits if it finds arena corruption. It then tests for the largest block 
available by noting the segment of the current MCB. If it is larger than 
any you have encountered to this point, DosMemLargest( ) exits success- 
fully; otherwise, it returns an error indicating that you are out of memory. 

A user-accessible memory manager function is DosMemChange( ) (List- 
ing 5.5), which changes the size of a given memory block, if possible. It 
can be used to either increase or decrease the size of the block. It can 
also be called without changing the size of the memory block, but that 
is a special case. 


Listing 5.4 The DosMemLargest() function — continued. 


/* Search through memory blocks */ 
for(q = (mcb FAR *)0, *size = 0, found = FALSE; !found; ) 
{ 
/* check for corruption */ 
if(p -> m_type != MCB_NORMAL && p -> m_type != MCB_LAST) 
return DE_MCBDESTRY:; 


/* Test for largest fit/available 
if((p -> mpsp = FREE_PSP) && (p -> m_size > *size)) 
{ 
*size = p -> msize; 
q = Di 
) 
/* not free - bump the pointer */ 
if(p -> mtype != MCB_LAST) 
p = MK_FP( far2para((VOID FAR *)p) + p -> msize + 1, 0); 
/* was there no room (q = 0)? */ 
else if(p -> m_type = MCB_LAST && q = (mcb FAR *)0) 
return DE_NOMEM; 
/* something was found - continue*/ 
else 
found = TRUE; 
) 
if( q !=0) 
return SUCCESS; 
else 
return DE_NOMEM; 
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Listing 5.5 The DosMemChange() function. 


COUNT 
DosMemChange (int para, int size) 
{ 

REG mcb FAR *p, FAR *q; 

REG COUNT i; 


/* Initialize 
p = (mcb FAR *)(Me_rPt--pare, 0)); 


/* check for corruption 
if(p -> m_type != MCB_NORMAL && p -> m_type != MCB_LAST) 
return DE_MCBDESTRY; 


/* check for wrong allocation 
if(size > p -> m_size) 
{ 

REG COUNT delta; 


/* make q a pointer to the next block */ 
q = MK_FP(far2para((VOID FAR *)p) + p -> msize + 1, 0); 


/* if next mcb is not free, error no memory =) 
if(q -> mpsp != FREE_PSP) 
return DE_NOMEM; 


/* reduce the size of q and add difference to p 
/* but check that q is big enough first 
delta = size - p -> msize; 
if(q -> msize < delta) 
return DE_NOMEM; 
q -> msize -= delta; 
p -> m_size += delta; 


/* Now go back and adjust q, we'll make p new q 
p = MK_FP(far2para((VOID FAR *)q) + delta, 0); 
p -> m.type = q -> m_type; 
p -> mMpsp = q -> mpsp; 
p -> m_size = q -> msize; 
for(i = 0; i < 8; i++) 

p -> mname[i] = q -> mnameli]; 


/* and finished 
return SUCCESS; 
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Like the other memory manager functions, DosMemChange( ) starts out 
by verifying that the memory block passed to it is valid. If it is not valid, it 
returns an error. If it is valid, it proceeds to check the three possible cases. 

If the new size is larger than the current size, DosMemChange( ) 
checks the next MCB to see if the block is free. The next memory 
block must be free in order for DosMemChange( ) to allocate memory. 
Unlike other operating systems, DOS-C does not have access to memory 
management hardware to allocate physical memory to logical memory. 
It must take it from the next inemory block. If the next block is not free, 
it returns an error. 


Listing 5.5 The DosMemChange() function — continued. 


/* else, shrink it down 
else if(size < p -> msize) 
{ 
/* make q a pointer to the new next block 
q = MK_FP(far2para((VOID FAR *)p) + size + 1, 0); 


/* reduce the size of p and add difference to q 
q -> mtype = p -> m_type; 

q -> msize = p -> msize - size - 1; 

p -> msiZe = size; 


/* Make certian the old psp is not last (if it was) 
p -> m.type = MCB_NORMAL; 


/* Mark the mcb as free so that we can later 
/* merge with other surrounding free mcb’s 
q -> m_psp = FREE_PSP; 
for(i = 0; i < B; i++) 
q -> m_name[i] = '\0'; 


/* now free it so that we have a complete block 
return DosMemFree( far2para((VOID FAR *)q)); 
) 


/* otherwise, its a no-op 
else 
return SUCCESS; 
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If the next block is free, DosMemChange( ) verifies that it is large 
enough to accommodate the memory change. If so, DosMemChange( ) 
proceeds to reduce the size of the next block and add the difference to 
the current block. Once DosMemChange( ) has reduced the size of the 
next block, it must move the MCB to the new location. With the new 
MCB in place, DosMemChange( ) successfully returns. 

If DosMemChange() is to reduce the size of the memory block, it 
reduces the size of the current block and adds the difference to the next 
black Tt creates a new NCB for the left-over memory and then exits 
through DosMemFree( ), which will merge the freed block, if possible, 
completing the reallocation process. 

The final possible scenario is when the new size is identical to the 
old size. In this case, DosMemChange( ) simply exits. Although there is 
no physical change, DosMemChange( ) does perform a validation of the 
memory block, This is a good way to check the validity of the block. 

DosMemCheck( ) (Listing 5.6) is the final memory manager function I 
will examine. Like DosMemLargest(), it is not accessible to the user. It 
is a DOS-C function that supports the memory manager. Because of the 
critical nature of the MCB headers and the arena itself, DOS-C must 


Listing 5.6 The DosMemCheck() function. 


COUNT 
DosMemCheck (void) 
{ 

REG mcb FAR *p; 


/* Initialize 
p = (mcb FAR *)(MK_FP(first_mcb, 0)): 


/* Search through memory blocks 
for( ; ; ) 
{ 
/* check for corruption 
if(p -> m_type != MCB_NORMAL && p -> m_type != MCB_LAST) 
return DE_MCBDESTRY: 
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check the validity of the arena at key times, such as when a program ter- 
minates. It is the responsibility of DosMemCheck() to guarantee that 
DOS-C does not operate with a corrupted memory arena. 

The DosMemCheck() algorithm is simple. It begins at the beginning 
of the memory arena and examines the MCB header. It checks for cor- 
ruption using the now familiar ASCII “M” or ASCII “Z” check. If the 
header is good, it computes the location of the next block and repeats 
the check. DosMemCheck() continues checking MCB headers until the 
fina! memory black has been checked. 


Arena Support Functions 


Aside from the arena management functions, the memory manager also 
supplies arena support functions. These functions are needed to support 
what may seem to be simple functions but are key to a portable design. 
For example, funtions are needed that extend the addition of scalar val- 
ues to segmented pointers so that data references may extend past the 
64Kb 80x86 boundaries. They are also important to assist in compiler 
independance for 80x86 processors. For example, integer arithmetic 
may wrap in one compiler or memory model and be adjusted in 
another. 


Listing 5.66 The DosMemCheck() function — continued. 


/* not corrupted - if last we're OK! 
if(p -> m_type == MCB_LAST) 
return SUCCESS; 


/* not corrupted - but not end, bump the pointer */ 
else if(p -> m_type != MCB_LAST) 
p = MK_FP(far2para((VOID FAR *)p) + p -> msize +1, 0); 


/* totally lost - bad exit */ 
else 
return DE_MCBDESTRY; 
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mcb_init() is a rather simple function (Listing 5.7) that initializes 
the memory arena. For real mode, this is rather simple. All you need to 
do is create a single memory block that is also the last block. From here 
on out, all the memory management functions maintain the MCB chains. 

DOS-C also provides two support functions that perform a logical to 
physical translation — loosely speaking. Again, you only support real 
mode, and as a result, there really is not much in the way of logical to 
physical support. However, these functions are of considerable help in 
comveriing segmented urchitecture pointers to linear rcpresentations. 
You will see both these functions used throughout the task and memory 
managers. 

The function far2para( ) (Listing 5.8) takes a far pointer and con- 
verts it to a segment address, which is different than simply taking the 
segment portion of the pointer. It computes a linear representation and 
then converts it back (Figure 5.3a). What you are left with is the largest 
segment possible for this address. 

Why do you need this? The segmented architecture of the Intel fam- 
ily allows a pointer 0000: fff0Oh. This address is not convenient for a 
pointer because with most C compilers targeted for Intel processors, 
you can only increment this by 0fh before the pointer wraps around to 
0000: 0000h. Using far2para() for the same pointer, you build a new 
pointer of Of ff :0000h. Now you get a useful range of ffffh before 


Listing 5.7 The mcb_init() function. 


VOID 
mcb_init (mcb FAR *mcbp, int size) 
{ 

COUNT i; 


mcbp - 


> m_type = MCB_LAST; 
mcbp -> 
> 


t 
psp = FREE_PSP; 
mcbp - size = size; 

for(i 


si <8: i+) 
mcbp -> m_nameCi] = '\O'; 
mem_access_mode = FIRST_FIT; 


m 
m 
m 
0 
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the pointer wraps around. Practically, you can guarantee a range of 
fff Oh or 65,520 bytes before the pointer wraps around. 

Function long2para( ) is much simpler but equally necessary (List- 
ing 5.8). In many instances during program loading, a linear logical 
address is computed and long2para() converts this address to the 














Listing 5.8 The far2para() and long2para() functions. 


seg 

far2para (VOID FAR *p) 

{ 
seg ul = FP_SEG(p): 
offset u2 = FP_OFF(p); 
ULONG phy_addr; 


phy_addr = (((long)ul) << 4) + u2; 
return (phy_addr>>4); 
} 








seg 
long2para (LONG size) 
{ 

return ((size + Ox0f)>>4); 
) 


Figure 5.3a far to paragraph operation. 


segment << 4 


f1{olo} 4] + offset 


= linear address 


>>4 


= adjusted paragraph 
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nearest segment (Figure 5.3b). Again, this function is a bridge between 
linear architecture and the segmented architecture of the Intel family. 

Support functions with more complexity are add_far() and 
adjust_far() (Listing 5.9). Again, these functions are bridges 
between linear and segmented architectures. They also assist DOS-C in 
overcoming segmented pointer limitations. 

As I discussed previously, many C compilers for Intel processors 
limit their pointer arithmetic to that which can be addressed within a 
segment. Unfortunately, DOS-C needs to address the entire I Mb of real 
memory range. add_far() addresses this limitation. Both the file sys- 
tem manager and the memory manager need to add a long to a far 
pointer, with the result of a new far pointer. add_far() accomplishes 


Figure 5.36 1ong to paragraph operation. 


2] 2]o}o[ 4] linear address (size) 
+ Fo} ojolo| fF] add to round up 


[ofo]ojolF] 
2] 2fo]1]3| 
2/2 1 


>>4 


12] 2]0{1 = segment 


Listing 5.9 The add_far() and adjust_far() functions. 


VOID FAR * 
add_far (VOID FAR *fp, ULONG off) 
{ 

UWORD seg_val;: 

UWORD off_val; 


/* Break far pointer into components 
seg_val = FP_SEG(fp); 
off_val = FP_OFF(fp); 
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this by breaking a f ar pointer into components and adding the offset to 
the far pointer’s offset part (Figure 5.4a). It then masks the offset so 
that it fits into a paragraph word and adds the top part into the segment. 
This top portion is effectively the carry into the segment. add_far() 
finally converts the individual components to a far pointer and returns 
it (Listing 5.9). 


Listing 5.9 The add_far() ana adjust_far( ) functions — 
continued. 


/* add the offset to the fp's offset part 
off += off_val; 


/* make off_val equal to lower part of new value 
off_val = off & Oxffff; 


/* and add top part into seg 
seg_val += ((off & Ox000f00001) / 0x10); 


/* and send back the new pointer 
return (VOID FAR *)MK_FP(seg_val, off_val); 
} 


VOID FAR * 
adjust_far (VOID FAR *fp) 
{ 

ULONG linear; 

UWORD seg_val; 

UWORD off_val; 


/* First, convert the segmented pointer to a linear address 
linear = (((ULONG)FP_SEG(fp)) << 4) + FP_LOFF(fp); 


/* Break it into segments. 
seg_val = (UWORD)(linear >> 4); 
off_val = (UWORD)(linear & Oxf); 


/* and return an adddress adjusted to the nearest paragraph 
/* boundary. 
return MK_FP(seg_val, off_val); 
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The function adjust_far() is also needed by the file system man- 
ager. It adjusts a far pointer so that you can get the maximum range 
out of C pointer arithmetic (Figure 5.4b), similar to far2para(). It 
does this by converting the segmented pointer to a linear address, 
breaking it into segments by masking, and returning an address 
adjusted to the nearest paragraph (Listing 5.9). 


Figure 54a add_far() operation. 


(2{3foj2} [rfofo7] — farw fp 
f/f] o{0] + of f_val 
[if o] F] 0} 7) 
| 


a 4 
re &Ox f fff split 
+ 
Figure 5.46 Normalized (adjusted) far pointer operation. 
segment 
+ offset 
; 


a) B0:t af = linear address 





off 


| 





final sum 


TL 


Vv 
3 3 fo} 1| Fofo}ol F| split adjusted pointer 
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Task Manager 


The DOS-C task manager is responsible only for starting tasks, switch- 
ing between user space and kernel space, and terminating tasks. In con- 
trast to other operating systems, it is a simple task manager; however, 
its simplicity will assist you in understanding machine support issues, 
such as initial register values and the switching of stack space. 


Assembly Language Support 


I will first examine fundamental assembly language support functions 
necessary for task management in DOS-C. These functions include the 
separation of user and kernel space, process start up, and process termi- 
nation. The DOS-C design calls for a minimum of assembly language, 
so our assembly language functions are straightforward and not very 
complicated. 

The first assembly language function I will examine is exec_user( ) 
(Listing 5.10). This function saves the current state on the kernel stack. 
It then takes the pointer passed to it and recovers the new process con- 
text from it. It uses the Intel iret instruction to recover the cs: ip and 
flags from the stack. This is the simplest way of jumping to a new 
address that remains compatible with normal interrupts. 

You will notice that the processor context is saved on the stack for 
both user and kernel processes. This technique is used throughout 
DOS-C. There are other ways to save a context, such as placing all the 
registers within a table, but that requires much more code than using the 
stack. It does, however, place the constraint that the stack must be large 
enough for the user functions plus the context. Fortunately, MS-DOS 
addresses this in the programming manuals, and all programs written 
for MS-DOS will have a stack large enough to use this technique. 
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The same technique is used by the handle_break( ) function (List- 
ing 5.10). DOS-C calls this function whenever a break has been detected. 
By MS-DOS standards, whenever a break is detected, an int 23h 


Listing 5.10 The exec_user() and hand]le_break() functions. 


Special call for switching processes 


void interrupt far exec_user(irp) 
jregs far *irp; 


Borland C++ specific 
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instruction is executed. This places the necessary cs: ip and flags for 
returning to the system on the user stack. When the user handler 
returns, it checks for an iret or retf return and handles each case 
appropriately. 


Listing 5.10 The exec_user() and handle_break() 
functions — continued. 


public Leyrt gar 
_exec_user Proc far 


push ax 
Push bx 
push cx 
Push dx 
push es 
Push ds 
push si 
push di 
mov ax,OGROUP 
mov ds,ax 
push bp 
mov bp.sp 
cld 

cli 


.WORD PTR (bp+24] ; irp (user ss:sp) 
.WORD PTR (bp+26] 

.ax ; set-up user stack 
.dx 


—exec_user 
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Process Environment 


The assembly language support functions you have just seen lay the 
foundation for the remainder of the task manager. These functions, 
along with memory manager support functions, are used extensively 


Listing 5.10 The exec_user() and handle_break() 
functions — continued. 


Special call for switching processes during break handling 


void interrupt far handle_break( ) 


- bp & sp after mov bp,sp 
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within the loaders. However, before diving into the loaders, I want to 
look at a process environment. 

A program, when loaded into memory, is a process. Each process 
within DOS-C has a data structure associated with it known as the PSP 
or Program Segment Prefix. It is important that you become familiar 
with this data structure because DOS-C uses it to tie together file sys- 
tem as well as task-related functions. 


Listing 5.10 The exec_user() and hand]le_break() 
functions — continued. 


public _handle_break 
_handle_break proc far 


push ax 

push bx 

push cx 

push dx 

push es 

push ds 

push si 

push di 

mov ax,OGRDUP 
mov ds,ax 
push bp 

mov bp,sp 
assume DS: DGROUP 
cld 


; handler body - start out by restoring stack 
pushf 
cli 


; Save background stack 
mov word ptr DGROUP:_api_ss.ss 
mov word ptr DGROUP:_api_sp.sp 


; restore foreground stack here 
mov ss,word ptr DGROUP:_usr_ss 
mov sp,word ptr DGROUP:_usr_sp 
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DOS-C also has two executable file types typically referred to as 
EXE and COM files. A COM file is a memory image of an executable 
program. DOS-C simply loads this type of program into memory. An 
EXE file, however, is a more complex file consisting of a header fol- 
lowed by the program and possibly a relocation table. It is important 


Listing 5.10 The exec_user() and handle_break() 
functions — continued. 


; get all the user registers back 


pop 
pop 
pop 
pop 
pop 
pop 
pop 
pop 
pop 


bp 
di 
si 
ds 
es 
dx 
cx 
Dx 
ax 


; do the int 23 handler and see if it returns 


int 


23h 


; we're back, must have been users handler 


push 
push 
push 
push 
push 
push 
push 
push 
push 
mov 

mov 

mov 


ax 


ax, DGROUP 
ds,ax 


assume DS: DGROUP 


; test for far return or iret 


cmp 
jz 


sp,_usr_sp 
hbrkl ; it was far ret 
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that you understand this data structure as well because it governs how 
the loader loads the program into memory. 

As you have seen before, the psp data structure (Listing 5.11) is 
another 80x86 processor-dependent data structure. It is designed to fit 
within the first 16 paragraphs. This results in a few unused fill regions 


Listing 5.10 The exec_user() and handle_break() 
functio:.s —-- continued. 


; restart int 21 from the top 
pop bp 

pop. di 

pop si 

pop ds 

pop es 

pop dx 

pop cx 

pop bx 

pop ax 

call _int2l_entry 
iret 


popf ; Clear the flag from the stack 
jnc hbrkl ; user wants to restart 
pop bp 
pop. di 
pop si 
pop. ds 
pop es 
pop dx 
pop cx 
pop bx 
pop ax 
mov ax,4c00h ; exit 
; set break detected flag 
mov byte ptr _break_flg,Offh 
cal] _int21l_entry 
iret 


_handle_break edp 
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Listing 5.11 The psp data structure. 


typedef struct 


{ 


*/ 


UWORD pS_exit; 
UWORD ps_siZe; 
BYTE ps_filll; 


/* CP/M-like entry point 


BYTE ps_farcall: 
VOID thaw hae reer ss 
VOIG (interrupt tAN pslisvc2ou), 


(interrupt FAR *ps_isv23)(), 
(interrupt FAR *ps_isv24)(); 

UWOROD ps_parent: 

UBYTE ps_files(20): 

UWORD ps_environ; 

BYTE FAR *ps_stack; 

WORD =ps_maxfiles; 

UBYTE FAR *ps_filetab; 

BYTE FAR *ps_prevpsp; 

BYTE FAR *ps_dta:; 


BYTE ps_fil)2(16]: 
BYTE ps_unix(3]; 
BYTE ps_fil13(9); 
union 
( 
struct 
( 
fcb 
_Ps_fcbl; 
) _ul; 
struct 
{ 
BYTE 
fi114(16): 
fcb 
_ps_fcb2: 
} _u2; 
Struct 
| 
BYTE fi1115{36); 
struct 
{ 
BYTE _ps_cmd_count; 
BYTE _ps_cmd(127]; 
} _u4; 
} _u3; 
} _u; 


/* 
/* 
/* 


/* 
/* 
/* 
/* 
/* 
/* 
/* 
/* 
/* 
/* 
/* 
/* 
/* 


/* 


/* 


/* 


CP/M-like exit point 
memory size in paragraphs 
single char fill 


far call opcode 

re-entry point 

terminate address 

break address 

critical error address 
parent psp segment 

file table - Oxff is unused 
environment paragraph 

user stack pointer - int 21 
maximum open files 

open file table pointer 
previous psp pointer 
process dta address 


a 
*/ 
*/ 


*/ 
*/ 
oa 
*/ 
*/ 
*/ 
*/ 
*/ 
*/ 
*/ 
*/ 
*/ 
a/ 
*/ 


unix-style call - Oxcd 0x21 Oxcb 


first command line argument 


second command line argument 


command tai} 


SE 


*/ 


*/ 
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that are placed in the data structure to guarantee alignment and compat- 
ibility with MS-DOS. A few areas also serve multiple purposes. These 
areas result in unions becoming members of the structure. Although it 
may seem complex at first glance, studying each member will help in 
understanding the psp. 

Historically, the psp has its roots in CP/M. The member ps_exit isa 
remnant of this CP/M heritage. It provides a CP/M-like exit point. A 
COM program can terminate by simply performing a jump to address 0. 
Another CP/M-like entry ovint is ps_farcall and ps_reentry(). 
These members are a call to the DOS-C system call dispatcher. A COM 
file may perform a system call with the cal] 5h mechanism that was 
used for CP/M. These two psp entries are now seldom used, but DOS-C 
maintains the structure members for compatibility with older programs. 

Three members, ps_isv22(), ps_isv23(), and ps_isv24(), rep- 
resent the initial process terminate address, break address, and critical 
error address vectors. Before DOS-C starts a process, the interrupt vec- 
tors for 22h, 23h, and 24h are copied into these psp structure members. 
Upon process termination, these vectors are copied back, and the 22h 
vector is executed. Quite often, these vectors are used as hooks into 
special termination processing. 

DOS-C also uses the psp for process-related information. For exam- 
ple, there are system calls to set and get a quantity known as a DTA or 
Disk Transfer Area. Because this is a process-unique entity, it is stored 
within the psp in structure member ps_dta. Another per-process entity 
is ps_s tack. This entry is the user stack pointer on entry into the int 21h 
handler. Other process-related members are ps_size, which holds the 
process memory size in paragraphs, and ps_parent, which contains the 
parent process psp segment. You should note that throughout DOS-C 
the psp segment is used as the process ID. It is this quantity that identi- 
fies the owner of a memory block and is stored in the MCB header. 
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I have previously noted that UNIX influenced MS-DOS in the area of 
file handles and redirection. psp data structure members ps_maxfiles, 
ps_files, and ps_filetab assist in the implementation. Member 
ps_maxfiles is a per-process variable that specifies the maximum 
number of open files permissible for this process. DOS-C uses a table 
for mapping user handles into system handles called ps_files. DOS-C 
also allows a user to change the number of files on a per-process basis. 
Member ps_filetab is a far pointer to the open file table. 

Two ther «-ntries echo other UNIX influences on MS-DOS. Jn MS-DOS 
and DOS-C, each program has an environment that it inherits from its 
parent process. The implementation is a segment of memory aligned 
on a paragraph boundary. The segment for the environment is stored in 
ps_environ. Finally, a UNIX-style system call entry, ps_unix, allows 
a COM program to perform a cal] 50h to dispatch a system call. 

The remainder of the psp is dedicated to two FCBs, _ps_fcbl and 
_ps_fcb2, which are parsed prior to the start of the program. These are 
compatibility entries that were used for command line variables as a 
shortcut to processing the command tail. Of course, the command line 
used to invoke the process less the verb is _u4. These fcbs and the 
command tail are held within the union _u. This union is designed to 
specifically match the psp entries. Aliases for the union/struct com- 
binations are _ps_fcbl and _ps_fcb2 for the two system FCBs and 
_ps_cmd and _ps_cmd_count for the command tail. 

Before becoming a process, a program may be stored on disk two 
ways. The first way is as a COM file, which is an exact binary image of 
the program as it initially appears in memory. The image must reserve 
the first 100h bytes for the psp and must use the tiny or small program- 
ming model. The second way is as an EXE file that is composed of a 
header, image, and an optional relocation table (Listing 5.12). The 
exe_header data structure is the mechanism used to tie the program 
together. 
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The first entry in exe_header is exSignature, which identifies an 
EXE file. This entry is either the ASCII characters “MZ” or “ZM”. The 
next two members indicate the file size in 512-byte pages. The first of 
these two, exExtraBytes, indicates the number of bytes in the last par- 
tial page. The second of these two, exPages, contains the number of 
whole and partial pages in the file. 

Member exRelocItems is a count of the number of relocation 
entries in the relocation table. Each entry is 4 bytes long and is stored in 
80x86 segment:offset fashion. Th- tinal member that determines the 
EXE file size is exHeaderSi Ze, which is the size of the header in 16-byte 
paragraphs. DOS-C uses this member to compute the size of the image 
that it loads into memory. 

Before loading the program into memory, DOS-C attempts to allo- 
cate memory for the program. It does this by computing the image size 
and calculating minimum and maximum memory requirements using 
members exMinAlloc and exMaxAlloc. With memory allocated, the 
image is loaded, and the relocation table is used to fix segment refer- 
ences in the program. These segment references are pointed to by the 


Listing 5.12. The exe_header data structure. 


typedef struct 

{ 
UWORD exSignature; 
UWORD exExtraBytes:; 
UWORD exPages; 
UWORD exRelocItems;: 
UWORD exHeaderSize; 
UWORD exMinAl loc; 
UWORD exMaxAl loc; 
UWORD exInitSS; 
UWORD exInitSP; 
UWORD exCheckSum; 
UWORD exInit IP; 
UWORD exInitCs; 
UWORD exRelocTable: 
UWORD exOverlay: 

} exe_header; 
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entries in the relocation table. DOS-C uses member exRelocTable to 
seek to the start of the relocation table within the file and then adjusts 
each segment reference for exRe] oc! tems iterations. 

With the program loaded into memory and far addresses adjusted 
for its new location, DOS-C uses exInitSS and exInitSP to set the 
initial stack pointer. It initializes each register and begins program exe- 
cution at cS: 1p specified by exInitIP and exInitCS. 

Two final file maintenance entries remain in exe_header. The first 
is exOveriay, whic! ts used fr overlays and specifies the overlay 
number that a file represents. The other is exCheckSum, which is 
designed to contain a checksum that guarantees file integrity. Neither 
MS-DOS nor DOS-C uses this entry. 

Now that I have covered the basic data structures, I can start looking 
at functions that fill these data structures. The DOS-C loaders use these 
functions to simplify the loading of executable programs. 

The first function I will examine is Chi]ldEnv() (Listing 5.13). This 
function creates a new environment for a process if one exists for the 
parent process. The first thing ChildEnv( ) does is compute the length 
of the environment by creating a far pointer and stepping through the 
strings until it finds the terminating null string. Chi ]dEnv() attempts to 
protect the kernel against corrupted environments by testing the envi- 
ronment size previously computed to see if it is greater than the maxi- 
mum permissible environment size. If it is, the environment may be 
corrupted or missing the terminating null string, and ChildEnv() 
aborts without creating a new environment area. 


Listing 5.13 The ChildEnv() function. 


COUNT 
ChildEnv (exec_blk FAR *exp, UWORD *pChildEnvSeg) 
{ 


BYTE FAR *pSrc; 
BYTE FAR *pDest; 
UWORD nEnvSize; 
COUNT RetCode; 
UWORD MaxEnvSize; 
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Listing 5.13 The ChildEnv() function — continued. 


/* create a new environment for the process 
if(exp -> exec.env_seg != 0) 
( 
for(nEnvSize = 0, pSrc = MK_FP(exp -> exec.env_seg,.0); 
*pSrc t= ‘\0': ) 
{ 
while(*pSrc != ‘\0') 
( 
rresre: 
+tnEnvSize, 
} 
/* account for terminating nul) 
++nEnvSize; 
++pSrc; 


} 

else 

( 
nEnvSize = 0; */ 
pChildEnvSeg = 0; 
return SUCCESS: 

} 


/* Test env size and abort if greater than max 
if(nEnvSize >= MAXENV) 
return DE_INVLDENV: 


/* allocate enough space for env + path 
if ((RetCode = DosMemAl loc(long2para(nEnvSize + 65), FIRST_FI1T, 
(seg FAR *)pChildEnvSeg, (UWORD FAR *)MaxEnvSize)) < 0) 
return RetCode: 
else 
pOest = MK_FP(*pChildEnvSeg + 1. 0): 


/* fi11 the new env and inform the process of its 
/* location throught the psp 
for(pSrc = MK_FP(exp -> exec.env_seg,0): *pSrc != ‘\0': ) 
( 
fstrncpy(pDest. pSrc, BUFFERSIZE): 
while(*pSrc != '\0') 
++pSrc; 
while(*pDest != '\0') 
++pDest; 
++pSrc; 
++pDest; 
} 
*pDest = ‘\0'; 
return SUCCESS; 
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Now that you have the environment size and some assurance that it 
is intact, ChildEnv() calls DosMemAl1oc() to allocate enough space 
for the new environment plus the path. If the memory is successfully 
allocated, ChildEnv() proceeds to fill the new environment and inform 
the new process of its location through the pointer pChi 1dEnvSeg. 

Just as the ChildEnv() function deals specifically with the creation 
of a process environment, DOS-C has the function new_psp() (Listing 
5.14) that deals specifically with the creation of a process psp. Similar 
to its environment -‘bii1.. new_psp() is used by the loaders to simplify 
the loading of an executable program into memory. This design is not 
unusual. You have seen this type of design throughout DOS-C. 


Listing 5.14 The new_psp() function. 


VOID 
new_psp (psp FAR *p, int psize) 
( 

REG COUNT i; 

BYTE FAR *1pPspBuf fer; 


/* Clear out new psp first 
for(|1pPspBuffer = p, i = 0; i < sizeof(psp) ; +i) 
*1pPspBuffer = 0; 
/* initialize all entries and exits 
/* CP/M-like exit point 
p -> ps_exit = Ox20cb; 
/* CP/M-like entry point - jump to special entry 
p -> ps_farcall= Oxea; 
iHifndef IPL 
p -> pS_reentry = int2l_entry: 
ftendif 
/* unix style call - Oxcd 0x21 Oxcb (int 21, retf) 
p -> ps_unix(0] = Oxcd; 
p -> ps_unix[(1] = 0x21; 
p -> ps_unix(2] = Oxcb; 


/* Now for parent-child relationships 
/* parent psp segment 

p -> psS_parent = DOS_PSP; 

/* previous psp pointer 

p -> pS_prevpsp = (BYTE FAR *)-1; 
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Listing 5.14 The new_psp() function — continued. 


fHifndef IPL 


/* Environment and memory useage parameters */: 
/* memory size in paragraphs */ 
Pp -> ps_size = psize; 

/* environment paragraph */: 
Pp -> psS_environ = 0; 

/* process dta */ 
D > ps_di> -— (BYTE FAR *)(&p > ps_cma_count;, 

/*™ terminate address */ 
p -> ps_isv22 = int22_handler; 

/* break address */ 
p -> ps_isv23 = int23_handler; 

/* critical error address */ 


p -> ps_isv24 = int24_handler; 
dtendif 


/* File System parameters */ 
/* user stack pointer - int 21 */ 
p -> ps_stack = (BYTE FAR *)-1; 

/* file table - Oxff is unused */ 


for(i = 0; i < 20; it) 
p -> ps_files(Ci] = Oxff; 


/* initialize stdin, stdout, etc */ 
p -> ps_files(STDIN] = 0; /* stdin */ 
p -> ps_files(STDOUT) = 1; /* stdout */ 
p -> ps_files([STDERR] = 2: /* stderr */ 
p -> ps_files(STDAUX] = 3; /* stdaux */ 
p -> ps_files[STDPRN] = 4; /* stdprn */ 
/* maximum open files ne 
p -> ps_maxfiles = 20; 

/* open file table pointer */ 
p -> ps_filetab = p -> ps_files; 

/* first command line argument */ 
p -> ps_fcbl.fcb_drive = 1; 

/* second command line argument */ 
p -> ps_fcb2.fcb_drive = 2; 





/* ]ocal command line 
p -> ps_cmd_count = 0: /* command tail */ 
p -> ps_cmd[0] = 0; /* command tail */ 
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The function new_psp( ) begins by clearing the memory region that 
will become the new psp. This action ensures a known state for mem- 
bers that are not currently used and also for variable-length fields such 
as the command tail. 

The algorithm behind new_psp() is simple: it initializes all entries 
and exits. It does not fill in psp structure members that are process spe- 
cific. In essence, it creates a skeletal psp that is necessary for both the 
COM loader and the EXE loader. 

The first structure member ‘! initializes is the CP/M-like exit point. 
It enters 2 bytes that are an int 20h instruction. This is the direct Ter- 
minate Program entry and guarantees termination whether the program 
performs a call or a jump to this location. It initializes the CP/M-like 
entry point next with a jump to special entry int21_entry. Then it ini- 
tializes the UNIX-style call that isan int 21h instruction followed by 
afar return. This completes the psp system call linkages. 

new_psp() next initializes the parent-child relationships. It fills in 
the parent psp segment with the DOS-C segment, DOS_PSP. This loca- 
tion will be overwritten later, but this initialization guarantees an 
owner. It then initializes the previous psp pointer to -1. This signifies to 
DOS-C that it is the start of the process chain. Again, it will be over- 
written by DOS-C when applicable. 

With process relationship structure members initialized, new_psp( ) 
proceeds to environment and memory usage structure members. It 
begins by initializing the process memory size in paragraphs with the 
parameter psize. Then, it initializes the environment with 0, indicating 
that no environment is assigned at this time. The program loaders will 
fill this in later. Next, new_psp() assigns the process dta to the com- 
mand tail area, finishing initialization of the process memory structure 
member. 

Next, new_psp() initializes the interrupt vector members. It first 
assigns the address of the int22_handler function to the terminate 
member. Then it initializes the break address member with the address of 
the int23_handler function. Finally, the critical error address structure 
member receives the address of the int24_handler function, and the 
user stack pointer member for int 21h system calls is initialized to -1. 
This last one is not a real vector but is grouped this way for simplicity. 
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new_psp() now proceeds to initialize the file system parameters. It 
first initializes the file table. Each entry is assigned Oxff, which indi- 
cates that the corresponding handle is unused or closed. It then initial- 
izes the first five handle entries for stdin, stdout, etc., with the 
corresponding table handles of the fixed system file. new_psp() then 
initializes the maximum open files structure member to the default of 
20 and completes the file assignments by assigning the open file table 
pointer to the psp file table. 

new_psp() complete- tts work by initializing the FCBs of the first 
and second command line arguments and the command tail, then exits. 
This completes the new_psp( ) function. Again, some of these entries 
will be overwritten, but this technique guarantees that the entire psp is 
initialized in one way or another and assists in the maintenance of 
DOS-C so that if any change accidentally omits a psp update, an error 
condition results that may be caught later. 


Program Loaders 


I now will take a look at the program loaders. This is the primary area of 
responsibility for the task loader of any DOS-compatible operating sys- 
tem. The basic methodology is simple. Look at the start of the file, if you 
see a magic number, assume that it is an EXE file; otherwise, it is a 
COM file. Based on this response, dispatch the correct loader. 

The basic entry point for task loading is DosExec() (Listing 5.15). 
Dos Exec( ) begins by checking the validity of the file. First it looks for 
the file by simply trying to open the file with dos_open(). If 
DosExec( ) cannot find the file because dos_open( ) returns an error, it 
assumes that the file does not exist and retums an error. Next it attempts 
to read the header. If the file header cannot be read, DosExec() again 
assumes a nonexistent file and returns an error. If it was successful, 
vosExec() calls dos_close(rc) toclose the file. Now for the acid test: 
if the header structure member exSignature is not MAGIC (“MZ”), then 
it assumes the file is a COM file and executes the COM loader; other- 
wise it assumes it’s an EXE file and executes the EXE loader. 
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Listing 5.15 The DosExec() function. 


COUNT 
DosExec (COUNT mode, exec_blk FAR *ep, BYTE FAR *1p) 
{ 


COUNT rc, nNumRead; 


/* If file not found - free ram and return error 
fHifdef IPL 

printf ("."); 
#endif 

if((rce = dos_open(1p, 0)) < 0) 

return DE_FILENOTFND; 
if((nNumRead = dos_read(rc, (VOID FAR *)&header, 
sizeof(exe_header) ) ) 
'= sizeof (exe_header) ) 


{ 
iHifdef IPL 
printf("Internal IPL error - Read failure (read %d != %d)", 
nNumRead, sizeof(exe_header) ); 
ilendif 
return DE_INVLDDATA; 
J 


dos_close(rc); 
ifdef IPL 

printf("."); 
dtendi f 


if(header.exSignature != MAGIC) 
ifndef IPL 
return DosComLoader(1p, ep, mode); 
ilelse 
( 
char szFileName[32]; 


fmemcpy((BYTE FAR *)szFileName, Ip); 
printf("\nFile: %s (MAGIC = 204x)", szFileName, 
header .exSignature); 

fatal("IPL can't load .com files!"); 

} 

itendif 

else 

return DosExeLoader(1p. ep, mode); 





184 — The FreeDOS Kernel 


The COM loader that DosExec() calls is DosComLoader( ) (Listing 
5.16). DosComLoader( ) is a good function to examine first. By examin- 
ing it, you can concentrate on the steps required to load a file into mem- 
ory without the complexity of the EXE file relocation, placement in 
memory rules, etc. 


Listing 5.16 The DosComloader() function. 


static COUNT 
DosComLoader (BYTE FAR *namep, exec_blk FAR *exp, COUNT mode) 
{ 
COUNT rc, env_size, nread; 
UWORD men; 
UWORD env, asize; 
BYTE FAR *sp, FAR *dp; 
psp FAR *p; 
mcb FAR *mp; 
iregs FAR *irp; 
LONG com_size; 


if(mode != OVERLAY) 
{ 
if((rc = ChildEnv(exp, &env)) != SUCCESS) 
return rc; 
J 
else 
mem = exp -> load.1]0ad_seg; 


/* Allocate our memory and pass back any errors 
if((rc = DosMemAlloc(0, LARGEST, (seg FAR *)&mem, 
(UWORD FAR *)&asize)) < 0) 

{ 

if(env != 0) 

DosMemFree(env); 

return rc; 
} 
else 
{ 

mp = MK_FP(mem, 0); 

++mem; 
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Listing 5.16 The DosComLoader() function — continued. 


/* Now load the executable 
/* If file not found - error 
/* NOTE - this is fatal because we lost it in transit 
/* from DosExec! 
if((rc = dos_open(namep, 0)) < 0) 
fatal("(DosComLoader) com file lost in transit"); 
/* do it in 32K chunks 
sp = MK fo<mem, sizeof(psr)): 
for{com_size = 655361, nread = 0: com_size > 0: ) 
{ 
nread = dos_read(rc, sp, CHUNK); 
sp += nread; 
com_size -= nread; 
if(nread < CHUNK) 
break; 
} 
dos_close(rc); 


if(mode = OVERLAY) 
return SUCCESS; 


/* point to the PSP so we can build it 
p = MK_FP(mem, 0); 
new_psp(p, mem + asize); 


/* clone the file table 
if (InDOS > 0) 
{ 

Psp FAR *q; 

REG COUNT i; 


q = MK_FP(cu_psp, 0); 
for(i = 0; i < 20; itt) 
{ 

REG COUNT ret; 


if(q -> ps_filesCi] != Oxff 

&& ((ret = CloneHandle(q -> ps_files[i])) >= 0)) 
p -> ps_files({i] = ret; 

else 
p -> ps_filesli] = Oxff; 
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As described earlier, a COM file is a simple memory image of the pro- 
gram. It is an exact image of the program at the instant prior to the start of 
execution. The rules are simple: reserve the first 256 bytes for a psp, and 
DOS-C guarantees to give you a stack, guarantees initial register values, 
and starts executing at 100h offset from the beginning of the file. This 
model is virtually identical to the CP/M model and made a good first 
choice for programs migrating from 8-bit to 16-bit environments. 

Although the concept is simple, the algorithm is somewhat more com- 
plex. Some issues must be dealt with when starting a program. It needs to 
systematically allocate resources, make assignments to psp structure 


Listing 5.16 The DosComLoader() function — continued. 


p -> ps_environ = env ~=0?0: env+1; 


/* complete the psp by adding the command line 
p -> ps_cmd_count = exp -> exec.cmd_line -> ctCount; 
fbcopy(exp -> exec.cmd_line -> ctBuffer, p -> ps_cmd, 

p -> ps_cmd_count > 127 ? 127 : p -> ps_cmd_count); 


/* stick a new line on the end for safe measure 
p -> ps_cmd[p -> ps_cmd_count] = '\r': 


/* identify the mcb as this functions’ 
/* use the file name less extension - left adjusted and 
/* space filled 
mp -> m_psp = mem + 1; 
for(asize = 0; asize < B; asize+) 
( 
if(namep[asize] != ‘.') 
mp -> m_name[Lasize] = namep[asize]; 
else 
break; 
) 
for(: asize < B; asize++) 
mp -> m_name[asize]) = ' '; 


/* build the user area on the stack 
irp = MK_FP(mem, (Oxfffe - sizeof(iregs))); 





DOS-C Kernel: Memory Manager and Task Manager — 187 


members, and trap potential errors. Although these issues somewhat 
complicate the task of loading a COM file, the code is straightforward. 
DosComLoader( ) begins by trying to allocate a user environment. It does 
this by calling ChildEnv( ) if it is not loading an overlay. DosComLoader( ) 
checks for errors and returns should ChildEnv() be unable to create the 


Listing 5.16 The DosComLoader() function — continued. 


/* start allocu:iny cis 

irp -> ES = irp -> DS = mem; 

irp -> CS = mem; 

irp - 0x100; 

irp - Oxffff; /* for now, until fcb code is in 
irp - 
irp - 
irp 
irp - 
irp - 
irp - 
irp - 


VVVNMN VY VMN VU VN Vv 


= 0 
FLAGS 


= 0; 


/* Transfer control to the executable 
p -> pS_parent = cu_psp; 
p -> ps_prevpsp = (BYTE FAR *)MK_FP(cu_psp, 0); 
p -> ps_stack = (BYTE FAR *)user_r; 
switch(mode) 
{ 
case LOADNGO: 
cu_psp = mem; 
dta = p -> ps_dta; 
exec_user(irp); 
/* We should never be here 
fata)( "KERNEL RETURNED! !!"); 
break; 


case LOAD: 
exp -> exec.stack = (BYTE FAR *)irp; 
exp -> exec.start_addr = MK_FP(irp -> CS, irp -> IP); 
return SUCCESS; 
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environment. Given that the first memory allocation succeeds, it next tries to 
allocate memory for the process and exits if an error occurs. Note that by 
specification, the memory allocation is for the largest memory block avail- 
able. Also, if DosComLoader‘( ) detects an error condition, it must deallocate 
the memory previously set aside for the environment. This pattern of deallo- 
cating previously set-aside resources continues through DosComLoader( ) 
because as more of the new process environment is built up, more system 
resources are used. An exit from an error condition must systematically 
retum the system reseurces crnersice NOS-C would slowly become use- 
less after repeated errors. 

At this point, DosComLoader() has allocated all the memory it 
needs to load the executable. It again starts out by attempting to open 
the file that contains the program image. If it cannot find the file, it per- 
forms error processing. Note that this time the error is fatal. The reason 
may not be clear until you examine how DOS-C arrived at this point. 
Previously, the file had to exist in order to test for the presence of a 
header. If an error occurs only a single function call later, a more seri- 
ous error, such as the corruption of the user data or a wild function call, 
caused DOS-C to execute this code erroneously. The odds are very 
good that DOS-C cannot recover, so it must terminate. 

Once DosComLoader( ) opens the file, it proceeds to load the image 
into memory in 32Kb chunks. It does this by repeatedly calling 
dos_read() and updating pointers so that the next iteration loads the 
next chunk in the proper location. There are only two ways to exit the 
loop: the first way is to try to read more than 64Kb, and the second way 
is to read less than 32Kb, indicating an end-of-file condition. Regard- 
less of the terminating condition, DosComLoader‘( ) then closes the file 
by calling dos_close(). This is also the exit point for an overlay, since 
all other resources are part of the root program. 

For program loading, however, DosComLoader( ) now must proceed to 
make this image into an executable process. It computes the location of the 
psp so that DosComLoader() can populate it. DosComLoader() calls 
new_psp() to initialize, then proceeds to clone the parent process’ file 
table. DosComLoader() places the true environment segment value into 
the psp and then completes it by moving the command line into the com- 
mand tail of the psp. 
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Next, DosComLoader() has to do some housekeeping. It needs to 
identify the mcb to signify that it belongs to this process. It does this in 
two ways. First, DosComLoader() uses the file name less extension 
after left adjusting it with trailing space. Second, it places the segment 
value of the psp address into the m_psp structure member. It is this 
structure member that DOS-C uses to identify free memory blocks. 

Finally, DosComLoader( ) must pay attention to the needs to the pro- 
cessor itself. It first computes the location of the machine stack and pro- 
ceeds io hyphae -o -..:: on the stack. It builds this ares :y ailocating 
register variables in a manner that complies with the exec_user( ) sup- 
port function. This is how DosComLoader() guarantees the register 
contents. With the machine context initialized, DosComLoader( ) trans- 
fers control to the loaded executable image through exec_user( ). One 
final note, if the function is invoked with a load and go condition, the 
exec_user() call should never return. If it does, you have another 
unrecoverable error and DosComLoader( ) invokes a fatal error. 

You now have a better understanding, after examining DosComLoader‘( ), 
of the basic steps in loading a program into memory and starting the process. 
I will now move on to DosExeLoader( ) (Listing 5.17) toexamine the more 
complete EXE file-loading process. 


Listing 5.17 The DosExeLoader() function. 


static COUNT 
DosExeLoader (BYTE FAR *namep, exec_blk FAR *exp, COUNT mode) 
( 
COUNT rc, env_size, i, nBytesRead; 
UWORD mem, env, asize, start_seg: 
ULONG image_size; 
ULONG image_offset; 
BYTE FAR *sp, FAR * dp; 
psp FAR *p; 
mcb FAR *mp; 
iregs FAR *irp; 
UWORD reloc[2]; 
seg FAR *spot; 
LONG exe_size; 
fHifdef IPL 
BYTE szNameBuffer[64]; 
{fendi f 
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As with DosComLoader(), DosExeLoader( ) first attempts to clone 
the environment and create a memory arena for the new process. Unlike 
DosComLoader( ), it next computes an image offset from the header 
because the image shares its size information within the file with the 
header and optional relocation and debugger information. With this infor- 
mation in hand, DosExeLoader() can now compute image size by add- 
ing the number of pages scaled to bytes to the size of the psp in bytes. 


Listing 5.17 The vos=xeLoader() function — continued. 


#Hifndef IPL 
/* Clone the environement and create a memory arena 
if(mode != OVERLAY) 
( 
if((rc = ChildEnv(exp, &env)) != SUCCESS) 
return rc; 
J 
else 
mem = exp -> load.]oad_seg; 
ifendif 


/* compute image offset from the header 
ifdef IPL 
fscopy(namep, (BYTE FAR *)szNameBuf fer); 
printf("\nEXE loader loading: %s", szNameBuffer ); 
ftendif 
image_offset = (ULONG)header.exHeaderSize * 161; 


/* compute image size by removing the offset from the number 
/* pages scaled to bytes plus the remainder and the psp 
/* First scale the size 
image_size = (ULONG)(header.exPages) * 5121; 
0 
/* remove the offset 
image_size -= image_offset; 
/* add in the remainder bytes 
if(header.exExtraBytes != 0) 
image_size -= (ULONG)(512 - header.exExtraBytes); 
ftendif 
/* and finally add in the psp size 
if(mode != OVERLAY) 
image_size += (ULONG)1long2para((LONG)sizeof(psp)); 
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This is now the minimum size of the memory block necessary to load the 
file. It will be used later to determine the actual memory block size. 

Next, DosExeLoader() attempts to find out how many paragraphs 
are available. It computes the minimum size of the EXE file by using 
the exMinAl loc header structure member. It then follows the rules for 
EXE file loading by attempting to allocate the maximum memory 
requested for the process — the exact amount available if less than the 
maximum requested but no less than the minimum requested. Once 
DosExeloader() has determined the amount of memory to request 
from the memory manager, it allocates the memory and returns if any 
errors occur. 

With a memory block available, DosExeLoader( ) next computes the 
far pointer to the space where the image will be loaded and proceeds to 
load the executable. In a manner similar to DosComLoader ( ), it attempts 


Listing 5.17 The DosExeLoader() function — continued. 


dHifndef IPL 
if(mode != OVERLAY) 
( 
/* Now find out how many paragraphs are available */ 
if((rc = DosMemLargest((seg FAR *)&asize)) != SUCCESS) 
return rc; 
else 
( 
exe_size = (LONG)]ong2para(image_size) + 
header.exMinAlloc + 
long2para((LONG)sizeof (psp) ); 
if(exe_size > asize) 
return DE_NOMEM; 
else if (((LONG)]long2para(image_size) + 
header.exMaxAlloc + 
long2para((LONG)sizeof(psp))) < asize) 
exe_size = (LONG) 1long2para(image_size) + 
header.exMaxAlloc + 
long2para((LONG)sizeof(psp)); 
else 
exe_size = asize; 
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Listing 5.17 The DosExeLoader() function — continued. 


/* Allocate our memory and pass back any errors */ 
/* We can still get an error on first fit if the above */ 
/* returned size was a bet fit case */ 
if((rc = DosMemAlloc((seg)exe_size, FIRST_FIT. 

(seg FAR *)&mem, 

(UWORD FAR *)&asize)) < 0) 


if(rc == DE_NOMFS:- 
{ 
if((rc = DosMemAlloc(0, LARGEST, (seg FAR *)&mem, 
(UWORD FAR *)&asize)) < 0) 
{ 
if(env != 0) 
DosMemFree(env); 
return rc; 
} 
/* This should never happen, but ... 
if(asize < exe_size) 
( 
if(env != 0) 
DosMemFree(env ); 
DosMemFree(mem) : 
return rc; 


} 
else 
( 
if(env != 0) 
DosMemFree(env); 
return rc; 


} 
else 
/* with no error, we got exactly what we asked for 


asize = exe_size; 
J 
ftelse 
mem = KERNSEG;: 
ftendif 
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to open the file and exits with a fatal error if it cannot. The same logic 
that applied to DosComLoader( ) applies to DosExeLoader( ); however, 
there is a major divergence at this point. Where DosComLoader() sim- 
ply started loading from the beginning of the file, Dos ExeLoader( ) per- 
forms a seek operation from the offset computed earlier to the start of 
the image. 

As in DosComLoader( ), DosExeLoader( ) uses a loop to read in the 
image in 32Kb chunks. However, unlike DosComLoader(), its loop exit 
criteria are either reading all wytes in the computed EXE file size or a 


Listing 5.17 The DosExeLoader() function — continued. 


ifndef IPL 
if(mode != OVERLAY) 
{ 
ftendif 
/* memory found large enough - continue processing 
mp = MK_FP(mem, 0); 
++mem; 
Hi fndef IPL 
) 
else 
mem = exp -> load.load_seg; 
ftendi f 


frifdef IPL 
printf(".") 
ftendif 
/* create the start seg for later computations 
if(mode = OVERLAY) 
Start_seg = mem; 
else 
start_seg = mem + long2para((LONG)sizeof(psp)); 


/* Now load the executable 

/* If file not found - error 

/* NOTE - this is fatal because we lost it in transit 

/* from DosExec! 

if((rc = dos_open(namep, 0)) < 0) 
fatal("(DosExeLoader) exe file lost in transit”); 
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Listing 5.17 The DosExeLoader() function — continued. 


fHifdef IPL 
printf("."); 
#tendif 
/* offset to start of image 
if (dos_lseek(rc, image_offset, 0) != image_offset) 
( 
HHifndef IPL 
iF@made }— OVERLAY: 
l 
DosMemFree( --mem) ; 
if(env != 0) 
DosMemFree(env ); 
) 
#tendif 
return DE_INVLDDATA; 
} 


#Hifdef IPL 
printf("."); 
endif 


/* read in the image in 32K chunks 
if(mode != OVERLAY) 
exe_size = image_size - long2para((LONG)sizeof(psp)); 
else 
exe_size = image_Size; 
sp = MK_FP(start_seg. 0x0); 
whi le(exe_size > 0) 
{ 
nBytesRead = dos_read((COUNT)rc, (VOID FAR *)sp, 
(COUNT) (exe_size < CHUNK ? exe_size : CHUNK)); 
sp = add_far((VOID FAR *)sp, (ULONG)CHUNK); 
exe_size -= nBytesRead; 
if(nBytesRead = 0 || exe_size <= 0) 
break; 
ifdef IPL 
printf("."); 
#tendif 
} 
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read error. Also, DosExeLoader( ) calls add_far( ) to update the pointer 
to the next chunk. This call is necessary in order to avoid the pointer 
addition problems discussed earlier in the chapter. DosExeLoader(), 
unlike DosComLoader( ), is not finished with the file when it completes 
the image load. It must next proceed to relocate the image for the new 
segment into which it was loaded. It does this by seeking to the offset 
given in the header structure member exRelocTable and looping the 
header structure member exRelocItems iterations. For each iteration, 


ti calls cus_ reac’? i: read the relocation segment:offset file pointer, 


Listing 5.17 The DosExeLoader() function — continued. 


#Hif 0 
/* Error if we did not read the entire image 
if(exe_size != 0) 
fatal("Broken exe loader (exe_size != 0)"); 
Hendif 


/* relocate the image for new segment 
dos_]lseek(rc, (LONG)header.exRelocTable, 0); 
for (i=0; i < header.exReloclItems; i++) 
( 
if (dos_read(rc, (VOID FAR *)&reloc(0J. sizeof(reloc)) 
'= sizeof(reloc)) 
{ 
return DE_INVLDDATA; 
if(mode = OVERLAY) 
{ 
spot = MK_FP(reloc(1] + mem, reloc(0]); 
*spot = *spot + exp -> load.reloc; 
) 
else 
( 
/* spot = MK_FP(reloc({1] + mem + 0x10, reloc(0]); 
spot = MK_FP(reloc{1] + start_seg, reloc(0]); 
*spot = *spot + start_seg; 
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Listing 5.17 The DosExeLoader() function — continued. 


ifdef IPL 
printf("."); 

itendi f 
/* and finally close the file 
dos_close(rc): 


/* exit here for overlay 
if(mode = OVERLAY) 
return SUCCESS; 


/* poinl to the PSP so we can build it 
p = MK_FP(mem, 0); 
new_psp(p, mem + asize); 


ifndef IPL 
/* clone the file table 
if(InDOS > 0) 
{ 
psp FAR *q; 


q = MK_FP(cu_psp, 0); 
for(i = 0; i < 20; i++) 


REG COUNT ret; 


if(q -> ps_filesCi] != Oxff 

&& ((ret = CloneHandle(q -> ps_files{iJ)) >= 0)) 
p -> ps_files{i] = ret; 

else 
p -> ps_filesCi] = Oxff; 


} 


p -> ps_environ = env = 070: env+1; 
ifelse 

p -> pS_environ = 0; 
#endif 
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computes the corresponding memory location, and adds the segment 
value to the original segment value in memory. 

When DosExeLoader( ) completes the file load and relocation process, 
it closes the file by calling dos_ctose(). As with DosComLoader‘( ), this 
is also the exit point for an overlay. 

In a manner similar to DosComLoader(), DosExeLoader() pro- 
ceeds to make this image into an executable process by computing the 
location of and building the PSP. 


Listing 5.17 The DosExeLoader() function — continued. 


/* complete the psp by adding the command line 
p -> psS_cmd_count = exp -> exec.cmd_line -> ctCount; 
fbcopy(exp -> exec.cmd_line -> ctBuffer, p -> ps_cmd, 

p -> psS_cmd_count > 127 ? 127 : p -> ps_cmd_count); 


/* stick a new line on the end for safe measure 
p -> ps_cmd[p -> ps_cmd_count] = ‘\r'; 


/* identify the mcb as this functions’ 
/* use the file name less extension - left adjusted and 
/* space filled 
mp -> m_psp = mem + 1; 
for(i = 0; i < 8; iH) 
{ 
if(namepCi] != °.') 
mp -> m_name[i] = namepLi]: 
else 
break; 
) 
for(: i < 8; i++) 
mp -> m_name[i] i 


/* build the user area on the stack 
irp = MK_FP(header.exInitSS + start_seg, 
((header.exInitSP - sizeof(iregs)) & Oxffff)); 


fHifdef IPL 
printf(".\n"); 
iendif 
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Listing 5.17 The DosExeLoader() function — continued. 


/* start allocating REGs 

/* Note: must match es & ds memory segment 

irp -> ES = irp -> DS = men; 

irp -> CS = header.exInitCS + start_seg; 

irp -> IP = header.exInitIP; 

irp -> AX = Oxffff; /* for now, until fcb code is in 
#Hifdef IPL 

irp -> BX 

irp -> CX 
#else 

irp -> BX 

irp -> CX 
ftendif 

irp -> DX 

irp -> SI 

irp -> DI 

irp -> BP = 0; 

irp -> FLAGS = 0; 


Booidrive; 
Numt ,oppies; 


/* Transfer control to the executable 
Pp -> pS_parent = cu_psp; 
p -> pS_prevpsp = (BYTE FAR *)MK_FP(cu_psp, 0): 
p -> ps_stack = (BYTE FAR *)user_r; 
#Hifdef IPL 
printf("Starting kernel ...\n"); 
dtendif 
switch(mode ) 


{ 

case LOADNGO: 
cu_psp = mem; 
dta = p -> ps_dta; 
exec_user(irp); 
/* We should never be here 
fatal("KERNEL RETURNED! !!"); 
break; 


#Hifndef IPL 
case LOAD: 
exp -> exec.stack = (BYTE FAR *)irp; 
exp -> exec.Start_addr = MK_FP(irp -> CS, irp -> IP); 
return SUCCESS; 
ifendif 
} 
} 
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DosExeLoader( ) alsocomputes the location of the machine stack and 
proceeds to build the user area on the stack. Unlike DosComLoader‘( ), it 
uses information in the header to do this, whereas DosComLoader ( ) sim- 
ply determined the stack from the process environment it built. It also 
places the machine context on the stack and transfers control to the loaded 
executable image through exec_user( ) using the same logic as that used 
in DosComLoader( ). As you will see next, this common logic is neces- 
sary for an orderly return. 


Returning From a Process 


When a process completes, it must return to the calling process. How- 
ever, it does not retum directly but exits through DOS-C. For this rea- 
son, DOS-C provides a single function, return_user(), to handle 
process termination. After you examine this function, you will better 
understand why the process context for starting a process requires a 
special stack format. You will also later see how this affects the system 
call interface. 

The return_user() function (Listing 5.18) has two basic purposes. 
First, it must systematically tear down the process environment that 
DosComLoader( ) or DosExeLoader() built up. Second, it must recover 
the old process context and restore it. In doing this, DOS-C acts on 
behalf of the terminating process and control simply passes through 
DOS-C. 

The first task return_user() undertakes is to close all files by 
looping through the file table and closing each file individually. Next, 
return_user() proceeds to returm the parent process context. It starts 
this by restoring the parent process interrupt vectors that were previ- 
ously saved in the psp. Next, return_user() begins to free all process 
memory if not a Terminate and Stay Resident (TSR) return. DOS-C 
sets a flag when a process returns through the TSR entry points. This 
flag is checked here to determine if memory resources and the psp 
itself should be discarded. If it is a TSR return, return_user() 
bypasses the retum of memory; otherwise, it returns the environment 
first, then the memory block for the process itself. 


200 — The FreeDOS Kernel 


Finally, return_user( ) reestablishes the parent process as the cur- 
rent process by restoring the process segment value as the current pro- 
cess and restoring the dta to the parent process’ dta. With all this out 
of the way, return_user( ) finally reenters the parent at the point prior 
to the system call that started the process by calling exec_user(). 


Listing 5.18 The return_user() function. 


VOID 

return_user (void) 

{ psp FAR *p, FAR *q; 
REG COUNT i; 


/* Close all files 
for(i = 0; i < 20; iH) 
DosClose(i):; 


/* restore parent 
p = MK_FP(cu_psp, 0): 


/* When process returns - restore the isv 
setvec(0x22, p -> ps_isv22); 
setvec(0x23, p -> ps_isv23); 
setvec(Ox24. p -> ps_isv24); 


/* And free al? process memory if not a TSR return 

if(!tsr) 

{ 
if(p -> ps_environ != (UWORD)O) 

DosMemFree(--p -> pS_environ); 

DosMemFree( - -cu_psp); 

} 

CU_PSp = p -> pS_parent; 

q = MK_FP(cu_psp,0); 

dta = q -> ps_dta; 

exec_user(p -> ps_Stack); 
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DOS-C Kernel: API 


If you are reading this book sequentially, you started looking at DOS-C 
from the bottom up. You studied the lowest levels of the I/O portions of 
the system by looking at the assembly language interface to the device 
drivers. You explored the buffer cache and the character canonical pro- 
cessing. You then discovered the relationships of the lower levels to the 
upper managers: the file system manager, the memory manager, and the 
task manager. 

I am going to break that flow now and approach the system calls 
from the top down. Why the change you may ask? Up until this point, 
the internals were unfamiliar to you and there was no foundation to dis- 
cuss the higher levels. Now you have a solid foundation in place, and 
you can begin building the structure on top by following the specifica- 
tions (Figure 6.1). 
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I am now going to discuss the DOS APIs. Although there are a few, 
I am going to focus ontwo: int 20hand int 21h. These two represent 
the two methods used by programs to communicate with DOS and are 
the most interesting of the system calls. 

Using this approach, the developer who is familiar with the API can eas- 
ily relate the system call to what is going on under the hood. This approach 
also allows the developer who is not familiar with the API an opportunity 


Figure 6.1 DOS API functionality layer. 
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to be guided through it and relate it to the internals. For developers in 
between those levels of familiarity, the following discussions will serve 
both as a refresher and guided tour of the interrelationships. 


System Call Support 


Warning: The code you are about to look at is highly 80x86 depen- 
dent. The reason is quite simple. Any code that deals with an operating 
system call is dependent on the underlying machine architecture. How- 
ever, the methods used are portable to other processors if you general- 
ize the technique. 

Any processor must have some mechanism to make a call. For 
example, a Motorola 680x0 typically uses software exceptions. The 
Intel 80x86 uses two methods. The first method, the call gate, is valid 
only in protected mode. A particular memory location in the form of 
cs: ip is dedicated to the system call by placing an entry into a mem- 
ory-resident table known as the translation lookaside buffer. The built- 
in Memory management unit wakes up when a call is made to this 
address and causes the processor to dispatch the handler for it. 

The second method, the software interrupt, is available in both real 
and protected modes. This method relies on a special instruction known 
as an int instruction. It functions in a way similar to a hardware inter- 
rupt or exception in that it causes a break in the normal flow of the code 
and forces the processor to begin a special piece of code to handle it. 
This instruction is usually followed by a number that indicates what 
vector address is to be used. To use it, you place the address of the han- 
dler into the table location that corresponds to the vector number used. 
You use it as a system call by making assignments to registers, or an 
area of memory, that are specially reserved for the system call, such as 
the stack. You then execute the int instruction, and when it returns, a 
register or memory location contains the results of the system call. 

DOS-C and MS-DOS both use the software interrupt method of sys- 
tem calls of which there are two types. The first type is simply a call 
that takes no parameters. The Terminate call (int 20h) and the Share 
Processor call (int 28h) are representative of this. The second type is 
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the more interesting of the two. This type of call takes both register and 
memory locations as variables and acts on them. In fact, the most inter- 
esting example is the int 21h call, where one register is used to multi- 
plex between a number of system calls. It is this system call that gives 
DOS-compatible operating systems the bulk of their personalities. 

Again, I warn the reader that the code you are about to study is very 
processor dependent. In fact, DOS-C uses the built-in interrupt function 
declarations that are common to 80x86 C compilers in order to simplify 
the design. However, you can use the same technique — by supplying an 
assembly language handler that acts as an interface to the C code — by 
saving the context onto the stack, setting the C call frame, performing the 
call, and then recovering the context before doing an interrupt return. 

DOS-C has two simple system calls to handle program termination. 
One is the Terminate call where the program loses all its resources and 
returns to the parent program. The other is the Terminate and Stay Res- 
ident call where the program again returns to the parent program, but 
all memory resources remain intact. This type of program is the equiva- 
lent of a daemon process that is used in other operating systems. The 
differences between the two are minimal. 

The function int20_handler() (Listing 6.1) is a very simple func- 
tion terminate function. First, it sets the flag tsr to FALSE in order to dif- 
ferentiate between a normal exit and a TSR exit. This is the same flag 
you saw in the task manager in Chapter S. It then calls DosMemCheck( ) 


Listing 6.1 The int20_handler() funtion. 


VOID INRPT far 

int20_handler (int es, int ds, int di, int si, int bp, int sp, 
int bx, int dx, int cx, int ax, int ip. int cs, 
int flags) 


tsr = FALSE; 
DosMemCheck(); 
return_user(); 
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from the memory manager to guarantee that the memory arena is still 
intact since DOS-C cannot function reliably if the memory arena is cor- 
rupted. It finally calls return_user() from the task manager in order 
to terminate the task. 

In the case of this system call, the only function an assembly lan- 
guage interface needs to perform is setting up the C call frame. As you 
can see from the function, none of the machine registers are used. How- 
ever, this is not the case in the function int21_handler(). 

In DOS-C, the int 21h handler is divided into two functions (Fig- 
ure 6.2). The first is int21_handler() (Listing 6.2). This is the func- 
tion whose address is placed into the int 21h vector. It starts by 
incrementing the I[nDOS flag. This flag indicates to DOS-C that it 


Figure 6.2 System call flow. 
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received an interrupt from within the kernel. It causes some functions to 
execute differently in order to guarantee a certain degree of reentrant 
functionality. It next creates a pointer from the system call frame, 
user_r. This pointer is an important structure, iregs, that describes 
the interrupt stack frame and becomes the primary means of communi- 
cation between the kernel and the user functions. All register references 
into and out of the kernel are contained on this stack. It is a f ar pointer 
because the context is contained within the executing program’s stack 
space, again contributing to a reentrant design. With the machine con- 
text in place, int2l1_handler() switches from the user stack to the 
kernel stack by calling set_stack(). At this point, int21_handler() 
has created the kemel environment. It calls int21_service() to dis- 
patch the function call and passes user_r, the pointer to the interrupt 
context, to allow the functions that were dispatched to access and mod- 
ify the user’s registers. When the system call has been completely ser- 
viced, int2l_handler() calls restore_stack() to switch back to the 
user stack, decrements InDOS, and returns. 

One note about this function. Most, if not all, of the 80x86 C com- 
pilers support the special function type interrupt. When a function is 
declared as a type interrupt, the compiler automatically places extra 
code to save all the machine registers. It does this in a particular order. 
To simplify the switching of tasks, DOS-C uses this register order to 


Listing 6.2. The int21_handler() function. 


VOID INRPT far 

int2]_handler (int es, int ds, int di, int si, int bp, int sp, 
int bx, int dx, int cx, int ax, int ip, int cs, 
int flags) 


+1 nD0S; 

user_r = (iregs far *)&es; 
set_stack():; 
int2l_service(user_r); 
restore_stack(); 

--InD0OS; 
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save both user and kernel context, as well as the register context for 
starting a program (Figure 6.2). This allows a single structure definition 
for all process-related functions, including system calls. It also adds 
additional instructions to establish the data section for the function. 
With all this out of the way, it finally creates the stack frame for the C 
call and begins its work. 


Listing 6.3 A sample C interrupt function handler and assembly 
language equivalent. 


int2l_handler (int es, int ds, int di, int si, int bp, int sp, 
int bx, int dx, int cx, int ax, int ip, int cs, 
int flags) 


assume cs:_TEXT 
_int2l_handler proc far 

push ax 

push bx 

push cx 

push dx 

push es 

push ds 

push si 

push di 

push bp 

mov bp,OGROUP 

mov ds,bp 

mov bp,sp 


( 
++1nDOS; 


word ptr DGROUP:_InDOS 
user_r = (iregs far *)&es; 
ax,word ptr [bp] 


word ptr DGROUP:_user_r+2,ss 
word ptr DGROUP:_user_r,ax 
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You will notice when examining the translated statements (Listing 
6.3) that in the assignment of the user context far pointer, user_r, the 
compiler is aware that the registers are passed on the stack and uses ss 


Listing 6.3 A sample C interrupt function handler and assembly 
language equivalent — continued. 


set_stack(): 
call near ptr _set_stack 
int2l_service(user_r); 


word ptr DGROUP:_user_r+2 
word ptr DGROUP:_user_r 
near ptr _int21_service 
Cx 

CX 


restore_stack(); 
near ptr _restore_stack 
--InDOS; 


word ptr DGROUP:_InDOS 


bp 
pop di 
pop si 
pop ds 
pop es 
pop dx 
pop cx 
pop bx 
pop ax 
iret 

_int2l_handler endp 
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for the segment portion of the pointer. This assignment allows the 
stack to be switched in the very next statement with no adverse effects 
on the code. 

The final aspect of int2l1_handler() to note is the exit. Upon ter- 
minating the function, the C compiler generates code to recover the 
context from the stack. This is the reason you needed to recover the 
user stack before returning. Also, the compiler generates an jret 
instead of the normal ret instruction to return from the interrupt. This 
makes the function a full interrupt handler in its own right, without ever 
having to resort to assembly language. These features may not be avail- 
able for your C compiler or platform, but the code presented in Listing 
6.2 should be used as a guide to your own specialized interrupt/system 
call handler. 

Once safely within kernel space, int21_service() is the function 
that actually handles the system call (Listing 6.4). It is arranged as a 
switch statement based on the contents of the ah register. It accesses 
the ah register through the iregs far pointer ras r -> AH. This is 
important to note because all registers are accessed in this fashion and 
are part of the user context as explained earlier. 


Listing 6.4 Anabbreviated int21_service() function. 


VOID int2l_service(iregs far *r) 
[ 

COUNT rc, rcl; 

ULONG Irc; 


/* The dispatch handler 
#ifdef DEBUG 

if (bDumpRegs ) 

( 
fbcopy((VOID FAR *)r, (VOID FAR *)&error_regs, 

sizeof(iregs)): 

printf("System call (21h): %202x\n", r -> AX); 
dump_regs = TRUE; 
dump(); 
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Listing 6.4 An abbreviated int21_service() function — 
continued. 


dendif 
Switch(r -> AH) 
{ 
/* print unused and fall into terminate (debug only) */ 
default: 
#Hifdef DEBUG 
fbcopy((VOID FAR *)r, (VOID FAR *)&error_regs, 
sizeof(iregs)); 
printf ("Invalid system cal) (21h)\n"); 
dump_regs = TRUE; 
dump(); 
# ifdef KDB 
break; 
dt endif 
ftelse 
r -> AX = -rc; 
r -> FLAGS |= FLG_CARRY; 
break; 
dtendif 


/* Terminate Program */ 
case 0x00: 

tsr = FALSE; 

return_mode = break_flg ? 1: 0; 

return_code = r -> AL; 

DosMemCheck(); 

return_user(); 

break; 


/* Read Keyboard with Echo */ 
case Ox0l: 

r -> AL = DosCharInputEcho(); 

break; 


/* Display Character */ 
case 0x02: 

DosDisplayOutput(r -> DL); 

break; 
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The switch has a default handler up front. By specification, the carry 
flag is set on an invalid system call and the default case handles this 
condition. It also has a bit of debugging logic worked in to assist in 
determining DOS compatibility. It is switched in by defining DEBUG 
during compilation. 

The rest of the switch is simple, although the cases are numerous. In 
each case, much of the manipulation of registers is done within the case 


Listing 6.4 An abbreviated int21_service() function — 
continued. 


case Ox0f: 
if(FcbOpen(MK_FP(r -> DS, r -> DX))) 
r-> AL =O; 
else 
r -> AL = Oxff; 
break; 


case 0x10: 
if(FcbClose(MK_FP(r -> DS, r -> DX))) 
r->AL=0: 
else 
r -> AL = Oxff; 
break; 


case Oxl14: 
( 
COUNT nErrorCode; 


if(FcbRead(MK_FP(r -> DS, r -> OX), &nErrorCode) ) 
r -> AL = 0; 

else 
r -> AL = nErrorCode; 

break; 
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itself. The case sets up either to complete the entire call itself, such as 
in the CP/M compatibility functions 0x18, 0xld, Oxle, and 0x20, or to 
make a function call to a handler for that system call. Sometimes it 
requires setting up local variables, at which time the case is a com- 
pound statement. In almost all cases, the case exits through a break but 
not until it performs some error processing. Unfortunately, the error 
processing must be done within the case because MS-DOS has differ- 
ent error returns for groups of system calls. 


Listing 6.4 Anabbreviated int21_service() function — 
continued. 


/* CP/M compatibility functions 


case Ox1B: 

case Oxld: 

case Oxle: 

case 0x20: 
r-> AL=0; 
break; 


/* Parse File Name 
case 0x29: 
( 


BYTE FAR *1pFileName; 


]pFileName = MK_FP(r -> DS, r -> SI); 
r -> AL = FcbParseFname(r -> AL, 
&lpFileName, 
MK_FP(r -> ES, r -> DI)); 

-> DS = FP_SEG(1pFileName) ; 
r -> SI] = FP_OFF(1pFileName) ; 
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Examine a few of the system calls. Perhaps the simplest function 
call that dispatches into the kerel is Read Keyboard with Echo, func- 
tion 0x01. In this function call, the al register returns a character 
entered at the keyboard after echoing it to the display. DOS-C provides 
the function DosCharInputEcho() to perform this function. The ser- 
vice code for the system call is a simple call to DosCharInputEcho( ) 
and an assignment of the return to r -> AL. Because no error is 
returned by this system call, it just terminates with a break to end sys- 
tem call processing. 

Another example with more functionality is the Display Character 
call, function 0x02. For this function, the character to be displayed is 
passed in the d] register. To provide this functionality, DOS-C provides 
the function call DosDisplayOutput(), which takes a single parame- 
ter: the character to be displayed. The call is invoked with the argument 
r -> DL inorder to retrieve the character from the user’s d1 register. 


Listing 6.4 Anabbreviated int21_service() function — 
continued. 


/* Dos Open 


case O0x3d: 
if((rc = DosOpen(MK_FP(r -> DS, r -> DX), r -> AL)) < 0) 
{ 


r -> AX = -rc; 
-> FLAGS |= FLG_CARRY; 


-> AX = rc; 
-> FLAGS & ~FLG_CARRY: 
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Listing 6.4 Anabbreviated int21_service() function — 
continued. 


/* Dos Close 
case Ox3e: 
if((rc = DosClose(r -> BX)) < 0) 
{ 
r-> AX = -rec; 
r -> FLAGS |= FLG_CARRY; 
} 
else 
r -> FLAGS & ~FLG_CARRY; 
break; 


/* Dos Read 


case Ox3f: 
rc = DosRead(r -> BX, r -> CX, MK_FP(r -> DOS, r -> DX), 
(COUNT FAR *)&rcl); 
if(rcl != SUCCESS) 
( 
ce 
r->A 


FLAGS |= FLG_CARRY; 
X= 


-rel; 


r -> FLAGS & ~FLG_CARRY; 
r-> AX = rec; 

} 

break; 


ifdef DEBUG 
i f(bDumpRegs ) 
( 
printf("Exiting system call (21h)\n"); 
fbcopy((VOID FAR *)r, (VOID FAR *)&error_regs, 
sizeof(iregs)); 
dump_regs = TRUE; 
dump(); 
printf("---\n"); 
J 
dtendif 
J 
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A somewhat more complicated system call is Parse File Name, func- 
tion 0x29. In this function, the filename to be parsed is at the address 
given by ds:si in the user registers, and the parsing is controlled by a 
flag in the al register. The function returns the next byte in the user’s ds 
and si registers, after the parse terminates, and a result flag inthe al reg- 
ister. In this example, you have local storage in the form of 1 pFileName 
and data transfers between internal functions and user registers. This 
type of function is much more common throughout DOS-C. 

Our final example requires looking at two function calls, Dos Open, 
function 0x3d, and FCB Open, function 0x0f. These functions illus- 
trate the logic behind two of the error retums in DOS-C. Both functions 
retrieve their arguments from user registers in a manner already seen. 
However, both functions have error returns. For FcbOpen(), a false 
return indicates an error. However, the system call returns an error con- 
dition in al. In order to accommodate, the case examines the retum 
from the function call and assigns the correct value using an 
if...else construct. However, the function DosOpen() returns not 
only an error condition by returning a negative number, but returns a 
handle, which is a positive integer for successful completion. The spec- 
ifications for this function call require that ax contains either the handle 
or an error number. The carry flag is checked to detect an error. This 
technique is much more common for the later system calls. Again, the 
case makes a call to DosOpen() and uses an if...e]Se construct to 
build the correct retum. However, there are several differences between 
FcbOpen() and DosOpen( ). The DosOpen( ) test within the if statement 
is for negative numbers. Also, the returned value is always in the ax 
register and accessed through r -> AX, although the sign of the error 
must be changed. Additionally, the carry flag is definitively set or 
cleared. 

Although these are the highlights of int21_service(), the system 
calls examined provide insight into the steps required to convert an 
int 21h call into a C function and vice versa (Figure 6.2). Unfortu- 
nately, each system call must be handled in a unique manner because of 
the way MS-DOS handles each call. But the overhead is minimal and 
the resulting compatibility far outweighs the headaches. Other system 
calls are a variation of the two examined here and you are encouraged 
to study them and expand on them for your own applications. 
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DOS-C Personality Layer 


If you have been following the system calls, you will notice that they 
call functions that you have not yet encountered. The reason is that 
there are functionality layers between the core DOS-C functions and 
the system calls (Figure 6.3). These functionality layers fall into three 
categories: simple I/O, FCB calls, and DOS calls. I will examine each 
of these three and study the relationships between the system calls and 
the core DOS-C functions. 


Figure 6.3. FCBand DOS call functionality layers. 
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Simple I/O Support 


Start by looking at the simple I/O functions. A number of them support 
the console, an auxiliary device, and a printer. These are direct calls 
down to either the device drivers or a DOS support function. They may 
not need translation and are sometimes simply links from the int 21h 
system call to the device driver. 

The function DosChar InputEcho( ) (Listing 6.5) is an example of sim- 
ple VO functions. It builds a request in the local vaniable rq, calls execrh(), 
and handles errors through char_error() should any occur. It then 
exits but not before calling DosCharOutput( ) to echo the character. 

The function DosDisplayOutput() (Listing 6.6) is another exam- 
ple, although somewhat more complex. This function does some output 
formatting. It checks for a return, delete, backspace, and horizontal tab. 
In each case except tab, it updates internal screen parameters then calls 
DosCharQutput( ) to output the character to the screen. Tab repeatedly 
calls DosCharOutput( ) to output a space until the screen position vari- 
able scr_pos is a multiple of eight. This algorithm is how DOS-C per- 
forms tab expansion. 


Listing 6.5 The DosCharInputEcho() function. 


int DosCharInputEcho(VOID) 
{ 

BYTE cb; 

request rq; 


rq.r_length = sizeof(request); 
rq.r_command = C_INPUT; 
rq.r_count = 1; 
rqg.r_trans = (VOID FAR *)&cb; 
rq.r_status = 0; 
execrh((request FAR *)&rq, syscon); 
if(rq.r_status & S_ERROR) 

return char_error(&rq, con_name); 
DosCharOutput (cb); 
return cb; 
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There are other simple I/O functions, but by now you should under- 
stand how these functions work. I urge you to look at the file 
inthnd1r.c to further study these functions. 

























Listing 6.6 The DosDisplayOutput() function. 


VOID DosDisplayOutput(COUNT c) 
{ 


/* Non-portable construct */ 
if(c < ‘' ° [| c = Ox7f) 
( 
switch(c) 
( 
case ‘\r': 
scr_pos = 0; 


break; 





case Ox/7f: 
++Scr_pos; 
break; 


case ‘\b’: 
if(scr_pos > 0) 

--SCr_pos; 
break; 


case ‘\t': 
do 

DosCharOutput(' '); 
while(scr_pos & 7); 

return; 





default: 
break; 

racenaaieouelSt 

ne 

| DosCharOutput(c); 
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DOS Function Support 


The next logical area to examine is the DOS Function Support, which 
includes functions that give DOS system calls, such as Make Directory, 
Open, etc., their functionality. These system calls typically work with a 
structure known as a system file table (SFT). Each entry is defined as 
the data structure sft in DOS-C. 

The sft data structure (Listing 6.7) is key to both DOS and File 
Control Block (FCB) support functions. Although DOS-C does not use 


Listing 6.7 The sft data structure. 


typedef struct 


{ 
WORD 
WORD 
BYTE 
WORD 
union 


struct 


struct dhdr FAR * 


sft_count; 
sft_mode; 
sft_attrib; 
sft_flags; 


_sft_dcb; 


_sft_dev; 


) sft_dcb_or_dev; 


WORD 
time 
date 
LONG 
LONG 
WORD 
WORD 
WORD 
BYTE 
BYTE 
BYTE 


WORD 

WORD 

WORD 

WORD 
] sft; 


sft_stclust; 
sft_time; 
sft_date; 
sft_size; 
sft_posit; 
sft_relclust; 
sft_cuclust; 
sft_dirdlust; 
sft_diridx; 
sft_name(11]; 
FAR * 
sft_bshare; 
sft_mach; 
sft_psp; 
sft_shrof f; 
sft_status; 


dpb FAR 


reference count 

open mode - see below 

file attribute - dir style 
flags - see below 


The device control block 
device driver for char dev 


Starting cluster 

File time 

File date 

File size 

Current file position 
File relative cluster 
File current cluster 
Sector containing cluster 
directory index 

dir style file name 


backward link of file sharing sft 
machine number - network apps 
owner psp 

Sharing offset 

this sft status 
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sft for file operations, it maintains it for compatibility with MS-DOS. 
Although sft is well known and used by many DOS applications, it is 
one of the “undocumented” data structures. It is useful to explain the 
SFT to better understand the DOS and FCB support functions. 

There is one entry for each open file in the system. This is the same 
as MS-DOS. Where DOS-C differs from MS-DOS is that it links an 
f_node to each SFT entry. This allows us to use the file system man- 
ager in place of DOS-specific file system code. 

Review the sft so you better understand how to use it within the 
interface level. Each structure starts with the member sft_count, a ref- 
erence count. This entry represents the number of file handles that refer 
to the file. In DOS-C, you maintain this entry and declare the SFT entry 
free when sft_count contains 0. This is the only real structure mem- 
ber that is used for SFT maintenance. The rest of the structure is 
devoted to the file. 

The first of the file-related members, s f t_mode, contains the mode the 
file was opened in (e.g., read, write, etc.). It is identical to the mode word 
passed as an argument to the DosOpen( ) system call (int 21h/function 
3dh). The attribute of the file, as stored in the directory, is stored in the 
structure member sft_attrib. In DOS-C this entry is updated for appli- 
cation purposes only because the real file attribute is derived from the 
f_node. Also associated with file functionality is sft_flags, which 
holds a set of flags that determine the attributes of the open file. If the file 
is a character device, this member determines whether it is a console or 
null device, or whether the file should retum an EOF condition. If it is a 
block device, it contains information such as the block device number and 
network file or drive information. 

Following sft_flags is sft_dcb_or_dev. This union holds the 
far pointer _sft_dcb, which points to the device control block, and 
the far pointer _sft_dev, which points to the device driver for a char- 
acter device. DOS-C uses this member whenever the file is a character 
device. For block devices, it maintains the field but uses the f_node 
information instead. Again, although DOS-C does not actively use an 
entry, it is important that the entry is maintained for compatibility. 
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The same is true for anumber of file-related entries in the sft. These 
members, which I will briefly discuss next, are not used by DOS-C but 
are maintained for compatibility. They are sft_stclust, sft_time, 
sft_date, sft_size, sft_posit, sft_relclust, sft_cuclust, 
sft_dirdlust, sft_diridx, and sft_name. These members all have a 
corresponding entry within the f_node. 

The remaining members are used for remote and shared applica- 
tions. Both members sft_shroff and sft_bshare are used by 
share.exe for file-sharing purposes. sft_bshare contains the back- 
ward link of the file-sharing SFT, and sft_shroff holds the sharing 
offset. The remaining entries, with the exception of sft_status and 
sft_psp, deal with network files. The member sft_mach is used to 
hold the machine number for network applications. 

In order to identify the owner of the file, sft_psp contains the Pro- 
gram Segment Prefix (PSP) of the owner. This field is actively used by 
DOS-C for the same purposes. Finally, sft_status contains the status 
of this SFT. In the case of DOS-C, it is actually the index of the corre- 
sponding f_node in the f_node table and is the link between the sft 
and the f_node. 

With the knowledge of the underlying sft data structure, now look 
at the DOS personality functions. The first function you will look at is 
DosOpen( ). It is representative of the personality functions that allocate 
an SFT. The method you will use is the same as you have used before: 
follow a typical sequence of open, read, and close and look at the 
details along the way. 

DosOpen( ) (Listing 6.8) starts by checking for valid arguments. As 
you have seen before, this is a necessary step to prevent the propagation 
of an error that may crash the system. For this function, the only test is 
a check to see if the mode argument is within range. 
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Next, DosOpen() proceeds to open the file. First it must get a free 
handle. It checks the return to make certain that the handle was actually 
allocated. If not, it returns an error. Once it gets a handle, it has to get 
an SFT entry, again checking for an error. 

Once resources have been allocated, DosOpen() must check for a 
device. The way DOS-C handles the file is dependent on whether or 
not the name specifies a disk file or a device. If itis a device, the sft is 


Listing 6.8 The DosOpen() function. 


COUNT 
DosOpen (BYTE FAR *fname, COUNT mode) 
( 
psp FAR *p = MK_FP(cu_psp,0); 
WORD hndl, sft_idx; 
sft FAR *sftp; 
struct dhdr FAR *dhp; 
BYTE FAR *froot; 
BYTE buffCFNAME_SIZE+FEXT_SIZEJ; 
WORD i; 


/* test if mode is in range 
if((mode & SFT_OMASK) != 0) 
return DE_INVLDACC; 


/* get a free handle 
if((hndl = get_free_hnd1l()) = Oxff) 
return DE_TOOMANY:; 


/* now get a free system file table entry */ 
if ((sftp = get_free_sft((WORD FAR *)&sft_idx)) —= (sft FAR *)-1) 
return DE_TOOMANY ; 


/* check for a device */ 
froot = get_root( fname): 
for(i = 0; i < FNAME_SIZE; i++) 
{ 
if(*froot != ‘\0' && *froot != ‘.') 
buffCi] = *froott+; 
else 
break; 
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Listing 6.8 The DosOpen() function — continued. 


for( ; i < FNAME_SIZE; i++) 
bufffiJ=''; 


/* if we have an extension, can’t be a device 
if(*froot !='."'); 
{ 
for(dhp = (struct dhdr FAR *)&nul_dev; dhp != 
(struct dhdr FAR *)-1; dhp = dhp -> dh_next) 
( 
if (fnmatch((BYTE FAR *)buff, (BYTE FAR *)dhp -> 
dh_name, FNAME_SIZE, FALSE)) 
{ 
sftp -> sft_count += 1; 
sftp -> sft_mode = mode; 
sftp -> sft_attrib = 0; 
sftp -> sft_flags = 
(dhp -> dh_attr & ~SFT_MASK) | SFT_FDEVICE | 
SFT_FEOF; 
sftp -> sft_psp = cu_psp; 
fbcopy((BYTE FAR *)buff, sftp -> sft_name, 
FNAME_SIZE+FEXT_SIZE); 
sftp -> sft_dev = dhp; 
p -> ps_filetab(hnd1] = sft_idx; 
return hndl: 


) 
} 
sftp -> sft_status = dos_open(fname, mode); 
if(sftp -> sft_status >= 0) 
{ 
p -> ps_filetab[hnd]] = sft_idx; 
sftp -> sft_count += 1; 
sftp -> sft_mode = mode; 
sftp -> sft_attrib = 0; 
sftp -> sft_flags = 0; 
sftp -> sft_psp = Cu_psp; 
fbcopy((BYTE FAR *)buff, sftp -> sft_name, 
FNAME_SIZE+FEXT_SIZE); 
return hndl; 
J 
else 
return sftp -> sft_status; 
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initialized and the member sf t_f1ags is set to indicate this. Also, the 
member sft_dev is initialized to point to the character device driver. 
This is the switch between character and file types. Device indepen- 
dence is also achieved by this, since the entry and all following func- 
tions deal with the driver in an identical fashion. 

If the name did not specify a device, DosOpen() proceeds to call 
dos_open( ) to open the disk file. The return from this call is stored in 
sft_status and tested for an error. If no error occurred, this entry is now 
the link between the sft and the f__node. For a successful disk file open, 
the sft is initialized, and the function returns the handle for this open. 

Once a file is opened you can proceed to read from it. The DOS per- 
sonality function that supports the system call is DosRead( ) (Listing 
6.9). This function handles both device and file access exactly the same 
way. For every call, DosRead() tries to read the number of bytes 
requested. It will continue to read bytes for every subsequent call until 
an EOF condition occurs, at which time it returns O bytes read. 


Listing 6.9 The DosRead() function. 


UCOUNT 
DosRead (COUNT hnd1, UCOUNT n, BYTE FAR *bp, COUNT FAR *err) 
{ 

sft FAR *s; 

WORD sys_idx; 

sfttbl FAR *sp; 

UCOUNT ReadCount ; 


/* Test that the handle is valid 
if(hndl < 0) 
{ 
*xerr = DE_INVLOHNDL: 
return 0; 
} 


/* Get the SFT block that contains the SFT 
if((s = get_sft(hndl)) = (sft FAR *)-1) 
{ 

*err = DE_INVLDHNDL; 

return 0; 
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The algorithm for DosRead{( ) starts in a way similar to other DOS-C 
system calls. It begins by validating the arguments passed to it. In this 
case, it verifies that the handle passed by the user is valid or at least 
within range. Next, it gets the SFT block that corresponds to the handle 
and checks if it is valid. If not, it returns an error indicating an invalid 
handle. The final check is for permissions. If the file is not open or does 
not have write permission, DosRead( ) exits and retumms an error code 
indicating that this was an invalid file access. 

Next it checks structure member sft_flags to see if the sft refers 
to a device. If the sft does refer to a file, DosRead( ) proceeds to doa 
device read. DosRead( ) first tests for end-of-file and exits immediately 
if it is. This prevents DosRead( ) from going any further if the device is 
no longer open or cannot produce more data. Next it handles raw 
(binary) and cooked (command line edit) modes. It examines the struc- 
ture member sft_flags to see if SFT_FBINARY is set. This flag differ- 
entiates between the two modes. If it is set, DosRead( ) performs a direct 


Listing 6.9 The DosRead() function — continued. 


/* If not open or write permission - exit */ 


if(s -> sft_count == 0 || (s -> sft_mode & SFT_MWRITE)) 
| 

*xerr = DE_INVLDACC; 

return 0; 
| 


/* Do a device read if device */ 
if(s -> sft_flags & SFT_FDEVICE) 
{ 

request rq; 


/* First test for eof and exit immediately if it is */ 
if(!(s -> sft_flags & SFT_FEOF) || (s -> sft_flags & SFT_FNUL)) 
( 

s -> sft_flags &= ~SFT_FEOF; 

*xerr = SUCCESS; 

return 0; 
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driver call that attempts to read the number of requested bytes. On return 
from the device driver call, DosRead() examines the request packet 
for errors and invokes the error handler if an error occurred. If no 
error occurred, it returns the number of bytes the driver actually read. 
DosRead() gets this from the request header structure member 
r_count. In this way, the device driver controls the transfer. This is an 
excellent way to perform direct reads and writes to a device. 


Listing 6.9 The DosRead() function — continued. 


/* Now handle raw and cooked modes 


if(s -> sft_flags & SFT_FBINARY) 
( 
rq.r_length = sizeof(request); 
rq.r_command = C_INPUT; 
rq.r_count = n; 
rq.r_trans = (BYTE FAR *)bp; 
rq.r_status = 0; 
execrh((request FAR *)&rq, s -> sft_dev); 
if(rq.r_status & S_ERROR) 
{ 
REG i; 
BYTE buf f[{FNAME_SIZE+1]; 


fbcopy(s -> sft_name, (BYTE FAR *)buff, 
FNAME_SIZE); 
buf f([FNAME_SIZE+1] = 0; 
for(i = FNAME_SIZE; i > 0; i--) 
if(bufffi] =" ') 
buff{i] = 0; 
else 
break; 
char_error(&rq, buff); 
} 
else 
( 
*err = SUCCESS; 
return rq.r_count; 
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If SFT_FBINARY is not set, DosRead() proceeds to read from the 
device in a “cooked” mode, i.e., the characters are processed for break 
detection, flow control is enabled, etc. There are two possible ways to get 
the data in this mode. The first is an attempt to read directly from the 
standard input device. DosRead( ) examines the sft structure member 
sft_flags to see if SFT_FSTDIN is set. If it is, it calls st i( ) to complete 





























Listing 6.9 The DosRead() function — continued. 


else if(s -> sft_flags & SFT_FSTDIN) 
{ 
if(!check_break() ) 
( 
kb_buf.kb_size = LINESIZE - 1; 
kb_buf.kb_count = 0; 
sti((keyboard FAR *)&kb_buf); 
fbcopy((BYTE FAR *)kb_buf.kb_buf, bp, 
kb_buf.kb_count); 
*err = SUCCESS; 
return kb_buf.kb_count; 
} 
else 
( 
*err = SUCCESS; 
return 0; 


) 

else 

( 
if(!check_break()) 
( 


*bp = _sti(): 
*err = SUCCESS; 
return 1; 


} 
else 
( 
*err = SUCCESS; 
return 0; 
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the read and returns the number of bytes read. If not, it calls the device 
driver support function and exits with the number of bytes read. The sec- 
ond way to get the data in a “cooked” mode is as an attempt to read from 
a device driver and _sti() is called instead. 

When it is not a device driver, DosOpen( ) relies on rdwrblock() to 
perform the file read, returning the count or an error condition if an 
error occurs. In this context, DosOpen( ) is simply a pass through to the 
fs support function call that was covered in detail in Chapter 4. 

When you are through with the file, you must close it and return 
resources back to DOS-C. DosClose() is the DOS personality function 
that implements the Close system call. It performs in a fashion similar 
to other calls you have seen. 


Listing 6.9 The DosRead() function — continued. 


else /* a block read 
{ 
if (!check_break( )) 
( 
COUNT rc; 


ReadCount = rdwrblock(s -> sft_status, bp, n, 
XFR_READ, &rc); 
if(rc != SUCCESS) 
{ 
*err = rc; 
return 0; 
} 
else 
| 
*xerr = SUCCESS: 
return ReadCount; 


| 
else 
{ 
*xerr = SUCCESS; 
return 0; 
} 
} 
*xerr = SUCCESS; 
return 0; 





DOS-C Kernel: API — 229 


As you have seen before, DosClose() (Listing 6.10) first verifies 
that the handle is valid or within range. Next, it gets the SFT block that 
corresponds to the handle. If it is not a valid handle, it returns a DOS 
error indicating that the handle is invalid. In order to close the file, the 


Listing 6.10 The DosClose() function. 


COUNT 

DosClose (COUNT hnd1) 

{ 
psp FAR *p = MK_FP(cu_psp.0): 
sft FAR *s; 


/* Test that the handle is valid 
if(nndl < 0) 
return DE_INVLDHNOL; 


/* Get the SFT block that contains the SFT 
if((s = get_sft(hnd])) = (sft FAR *)-1) 
return DE_INVLDHNOL; 


/* If this is not opened another error*/ 
if(s -> sft_count — 0) 
return DE_ACCESS; 


/* now just drop the count if a device, else 
/* call file system handler 
if(s -> sft_flags & SFT_FDEVICE) 
( 
p -> ps_filetab[hnd]] = Oxff; 
s -> sft_count -= 1: 
return SUCCESS; 
J 
else 
{ 
p -> ps_filetab[hnd1] 
s -> sft_count -= 1; 
if(s -> sft_count > 0) 
return SUCCESS; 
else 
return dos_close(s -> sft_status); 
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final test verifies that the file is open, so that you do not incorrectly 
retum resources that were not allocated, potentially crashing DOS-C. 
So, if the file is not open, you retum an invalid access error. 

Now examine whether or not the file is a device. If it is, DosClose() 
just decrements the SFT count. This allows devices to be opened and 
shared by duplicating the handle using the Duplicate system call. Oth- 
erwise, it calls the file system handler to close the disk file. 

This function completes the tour of the DOS personality functions. 
This gives you a good idea of how to give a system call a particular fla- 
vor. I will now look at the other DOS-C personality function set. 


FCB Function Support 


I will now examine the File Control Block or FCB function calls. 
These calls provide the old CP/M-style FCB function call personality. 
Their design is similar to the DOS function calls. Both are layers that 
perform the direct function of the matching system call. Also, both cre- 
ate a translation service into the three managers: file system, task, and 
memory. 

The FCB function calls also rely heavily on a kernel data structure. 
The difference is that this structure does not exist solely within the ker- 
nel, but is resident in user space. This places some additional restric- 
tions on DOS-C. All designs of FCB support functions must allow for 
users to modify the FCB or complete an operation without a logical ter- 
mination, i.e., open a file but never close it. 

The FCB is a well-known MS-DOS data structure. It is inherited from 
its CP/M ancestor and retains many fields that are similar to its CP/M 
counterpart. It also forms the basis for the DOS-C fcb data structure 
(Listing 6.11). It is important that this data structure closely matches 
the MS-DOS version because it is the only means of communicating 
file-related information for the FCB system calls. 

Structure member fcb_drive contains the drive number for the file. 
By convention, the drives are mapped as 0 = default, 1 = A, etc. The file- 
Name is contained in two members: fcb_fname and fcb_fext. These 
are left-aligned, space-filled representations of the traditional DOS 
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file.ext filename. The next structure member, f cb_cublock, contains 
the current file position represented as a block number where the block is 
defined as 128 records. It is mainly used for sequential read/write opera- 
tions. In order to further refine the current file position, the next structure 
member, f cb_recsiZ, specifies the logical record size in bytes. 

Starting with the next member, the fcb starts to record file and direc- 
tory information. The member f cb_fsize contains the file size in bytes 
that is initialized when a file is opened or created. DOS-C updates this 
member with each write operation. File date and time are recorded in 
structure members fcb_date and fcb_time. 

By convention, DOS-C reserves an area of the FCB in order to meet 
the requirement that users may modify the FCB or complete an opera- 
tion without a logical termination. The next group of structure members 
are reserved for DOS-C. The first member in the reserved group is 
fcb_sftno. This member is used to map the FCB to an internal SFT 
structure. By starting at fcb_sftno, DOS-C can follow the links down 
into the f_node structure used by the file system manager. For a device, 
the next two members, fcb_attrib_hi and fcb_attrib_lo form an 
attribute mode for files and devices. 


Listing 6.11 The fcb data structure. 


/* File Control Block (FCB) 
typedef struct 
( 
BYTE fcb_drive: /* Drive number O=default, 1=A, etc 
BYTE fcb_fname({FNAME_SIZE]: /* File name 
BYTE fcb_fext(FEXT_SIZE]; /* File name Extension 
UWORO fcb_cublock: /* Current block number of 
/* 128 records/block. for seq. r/w 
UWORD fcb_recsiz: /* Logical record size in bytes. default = 128 
ULONG fcb_fsize: /* File size in bytes 
date fcb_date: /* Date file created 
time fcb_time: /* Time of last write 
/* the following are reserved by system 
BYTE  fcb_sftno: /* Device ID 
BYTE fcb_attrib_hi; /* share info, dev attrib word hi 
BYTE fcb_attrib_lo; /* dev attrib word 1o. open mode 
UWORO fcb_strtclst; /* file starting cluster 
UWORD fcb_dirclst; /* cluster of the dir entry 
UBYTE fcb_diroff: /* offset of the dir entry end reserved 
UBYTE fcb_curec: /* Current block number of 
ULONG fcb_rndm; /* Current relative record number 
} feb: 
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When the device is a file, fcb_strtclst records the file’s starting 
cluster and fcb_dirclst specifies the cluster that contains the file’s 
directory entry. In order to easily locate the directory entry, structure 
member fcb_diroff contains the offset of the directory entry itself. 
When combined with fcb_dirclst, a directory entry is fully qualified. 

By MS-DOS design, the remaining structure entries are user mem- 
bers that are used in random read, write, and seek operations. The first 
of these members is fcb_curec, which contains the current block num- 
ber. Again to fully specify a record, the structure member fcb_rndm 
contains the current relative record number of the random operation. 

As I have done before, I will study the FCB personality functions by 
studying the steps needed to open, read, and close a file. The first func- 
tion I will look at is FcbOpen(). This function takes a single far 
pointer that points to the FCB in the users memory area as does virtu- 
ally all of the FCB functions. 

FcbOpen() (Listing 6.12) begins by attempting to get a free SFT 
entry. In DOS-C, all calls use an SFT entry to map the FCB in user 
space into a kernel data structure. Should FcbOpen( ) be unable to get 


Listing 6.12 The FcbOpen() function. 


BOOL FcbOpen(1pXfcb) 
xfcb FAR *1pXfcb; 
{ 
WORD sft_idx; 
sft FAR *sftp; 
struct dhdr FAR *dhp; 
BYTE buf f(FNAME_SIZE+FEXT_SIZE + 3]; 
fcb FAR *1pFcb; 
COUNT FcbDrive; 


/* get a free system file table entry */ 
if((sftp = FcbGetFreeSft( (WORD FAR *)&sft_idx)) = (sft FAR *)-1) 
return DE_TOOMANY ; 


/* Build a traditional DOS file name */ 
IpFcb = CommonFcbInit(1pxfcb, buff, &FcbDrive); 
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an SFT entry, it returns a DOS error indicating that too many files are 
open. Next, FcbOpen() converts the XFCB entry (an extended FCB) 
into an FCB and extracts the filename from the FCB into a local array. 
It builds a traditional DOS filename in the form of a:file.ext that is 
understood by the support functions. 


Listing 6.12 The FcbOpen() function — continued. 


/* check for a device 
/* if we have an extension, can't be a device 
if (IsDevice(buff) ) 
{ 
for(dhp = (struct dhdr FAR *)&nul_dev; dhp != 
(struct dhdr FAR *)-1; dhp = dhp -> dh_next) 
{ 
if(FcbFnameMatch( (BYTE FAR *)buff, (BYTE FAR *)dhp -> 
dh_name, FNAME_SIZE, FALSE)) 
{ 
sftp -> sft_count += 1; 
sftp -> sft_mode = O_RDWR; 
sftp -> sft_attrib = 0; 
sftp -> sft_flags = 
(dhp -> dh_attr & ~SFT_MASK) | SFT_FDEVICE | 
SFT_FEOF; 
sftp -> sft_psp = Cu_psp; 
fbcopy(|pFcb -> fcb_fname, sftp -> sft_name, 
FNAME_SIZE+FEXT_SIZE): 
sftp -> sft_dev = dhp; 
lpFcb -> fcb_sftno = sft_idx; 
lpFcb -> fcb_curec = 0; 
\pFcb -> fcb_recsiz = 0; 
lpFcb -> fcb_fsize = 0; 
\pFcb -> fcb_date = dos_getdate(): 
lpFcb -> fcb_time = dos_gettime(); 
lpFcb -> fcb_rndm = 0; 
return TRUE; 
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In a fashion similar to DosOpen(), FcbOpen( ) checks for a device 
because it must initialize the SFT differently for devices than for disk 
files. If it is not a device, it calls dos_open( ) to open the disk file. Once 
this is done, FcbOpen( ) proceeds to populate the FCB with information 
necessary to both the user and DOS-C. In essence, FcbOpen() per- 
forms many of the same functions as DosOpen(), but functions in a 
slightly different manner because of the method used to call it. 

File reads are handled by FcbRead() (Listing 6.13). As with 
FcbOpen( ), FcbRead( ) accepts a far pointer into user space that points 
to the user’s FCB. The first step that FcbRead() takes in servicing the 


Listing 6.12 The FcbOpen() function — continued. 


sftp -> sft_status = dos_open(buff, O_RDWR); 
if(sftp -> sft_status >= 0) 
( 
lpFcb -> fcb_drive = FcbDrive; 
lpFcb -> fcb_sftno = sft_idx; 
lJpFcb -> fcb_curec = 0; 
lpFcb -> fcb_recsiz = 128; 
lpFcb -> fcb_fsize = dos_getfsize(sftp -> sft_status); 
dos_getftime(sftp -> sft_status, 
(date FAR *)&lpFcb -> fcb_date, 
(time FAR *)&lpFcb -> fcb_time); 
lpFcb -> fcb_rndm = 0; 
sftp -> sft_count + 1; 
sftp -> sft_mode = O_RDWR; 
sftp -> sft_attrib = 0; 
sftp -> sft_flags = 0; 
sftp -> sft_psp = cu_psp; 
fbcopy( (BYTE FAR *)&lpFcb -> fcb_fname, 
(BYTE FAR *)&sftp -> sft_name, FNAME_SIZE+FEXT_SIZE); 
return TRUE; 
J 
else 
return FALSE; 
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system call is to convert the extended FCB to an FCB if necessary. It 
then retrieves the SFT block that contains the sft by using the member 
fcb_sftno. If it cannot retrieve the block, the file was probably not 
opened and FcbRead( ) returns an error. If the FCB correctly links to an 


Listing 6.13 The FcbRead() function. 


BOOL FcbRead(1pXfcb, nErrorCode) 
xfcb FAR *1]pXfcb; 
COUNT *nErrorCode; 


sft FAR *s; 

fcb FAR *\pFcb; 

LONG 1Posit; 

COUNT nRead; 

psp FAR *p = MK_FP(cu_psp,0); 


/* Convert to fcb if necessary 
JpFcb = ExtFcbToFcb(1pXxfcb); 


/* Get the SFT block that contains the SFT 
if((s = FebGetSft(|lpFcb -> fcb_sftno)) = (sft FAR *)-1) 
return FALSE; 


/* If this is not opened another error*/ 
if(s -> sft_count = 0) 
return FALSE; 


/* Now update the fcb and compute where we need to 
/* position to. 
1Posit = ((1pFcb -> fcb_cublock * 128) + IpFcb -> fcb_curec) 
* |pFcb -> fcb_recsiz; 
if(dos_lseek(s -> sft_status, ]Posit, 0) < 0) 
( 
*nErrorCode = FCB_ERR_EOF; 
return FALSE; 
J 


/* Do the read */ 
nRead = dos_read(s -> sft_status, p -> ps_dta, IpFcb -> fcb_recsiz); 
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open file, FcbRead( ) must now update the FCB and compute the cor- 
rect file position. With this position computed, FcbRead( ) updates the 
file with a call to dos_]seek(), returning an error if it cannot properly 
perform the seek. 


Listing 6.13 The FcbRead() function — continued. 


/* Now find out how we will return and do it. 
if(nRead = 1pFcb -> fcb_recsiz) 
{ 
*nErrorCode = FCB_SUCCESS; 
FcbNextRecord(1pFcb); 
return TRUE; 
} 
else if(nRead < 0) 
{ 
*nErrorCode = FCB_ERR_EOF; 
return TRUE; 
J 
else if (nRead == 0) 
{ 
*nErrorCode = FCB_ERR_NODATA; 
return FALSE; 
\ 
else 
{ 
COUNT nIdx, nCount; 
BYTE FAR *1pDta; 


nCount = IpFcb -> fcb_recsiz - nRead; 

lpDta = (BYTE FAR *)&(p -> ps_dta{nRead]); 

for(nIdx = 0; nIdx < nCount; nIdx+) 
*|pDtat+ = 0; 

*nErrorCode = FCB_ERR_EOF; 

FcbNextRecord(1pFcb); 

return FALSE; 
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Once the file pointer is updated, FcbRead() performs the read oper- 
ation through a call to dos_read( ) using the link to the f_node in the 
sft structure member sft_status, transferring the data to the DTA as 
indicated in the ps_dta member. The number of bytes to be transferred 
is the number of bytes in a record for this FCB as stored in fcb struc- 
ture member fcb_recsiz. 

Based on the return, FcbRead( ) will return in any of three ways. If 
the number of bytes read matches the record size, the FCB is updated 
by a call to FcbNextRecord() and returns normally. If it returns fewer 
bytes read than a record size, FcbRead() returns with an EOF error 
code. If the number read is exactly zero, FcbRead() returns an error 
indicating that no data was read. 

Finally, the function FcbClose( ) (Listing 6.14) is called to close the 
file. Again, a pointer to the user’s FCB in user space is passed in and 
operated on. As you saw with FcbRead(), FcbClose() converts the 
user’s extended FCB to an FCB if necessary. It also gets the SFT block 
that contains the sft and returns an error if it was not opened. 

Next, FcbC lose( ) checks to see if the file opened was a device. If it 
was, it decrements the sft_count member of the s ft and returns. Oth- 
erwise, it sets the file’s time, date, and size by calling dos_setftime( ) 
and dos_setfsize(). With this information updated, FcbClose( ) 
calls dos_close( ) and exits. 
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Listing 6.14 The FcbClose() function. 


BOOL FcbClose(1pXfcb) 
xfcb FAR *1pXfcb; 


{ 


sft FAR *s; 
fcb FAR *1pFcb; 


/* Convert to fcb if necessary 
1pFcb = ExtFcbToFcb(1pXfcb); 


/* Get the SFT block that contains the SFT 
if((s = FebGetSft(lpFcb -> fcb_sftno)) = (sft FAR *)-1) 
return FALSE; 


/* If this is not opened another error 
if(s -> sft count = 0) 
return FALSE; 


/* now just drop the count if a device, else 
/* call file system handler 
if(s -> sft_flags & SFT_FDEVICE) 
( 
s -> sft_count -= 1; 
return TRUE; 
} 
else 
( 
s -> sft_count -= 1; 
if(s -> sft_count > 0) 
return SUCCESS; 
else 
( 
/* change time and set file size 
dos_setftime(s -> sft_status, 
(date FAR *)&lpFcb -> fcb_date, 
(time FAR *)&lpFcb -> fcb_time); 
dos_setfsize(s -> sft_status, IpFcb -> fcb_fsize); 
return dos_close(s -> sft_status) == SUCCESS; 





Chapter 7 


Command Line 
Interpreter 


Until now, I have been concerned primarily with the details of the 
DOS-C kernel. I have looked at the system call mechanisms, file sys- 
tem functions, and memory management and task management func- 
tions. I have also dug into the device driver interface and investigated 
the C-to-assembly language interface for calling standard DOS device 
drivers. With all these details exposed, it’s time to look at other compo- 
nents of the operating system, beginning with the user-level commands. 
A good place to start is with DOS-C’s version of command. com. 

You may start by asking ““How does command.com fit in?” In DOS-C, 
it is the first program started by the kernel and the parent of every 
other program started through either a command entered at the key- 
board or sets of commands in a batch file. It is the component of the 
operating system responsible for maintaining the master environment 
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and running the system initialization program, autoexec.bat. In many 
ways, it is similar to the init program that runs on many UNIX sys- 
tems, but it combines the features of command line parsing and initial- 
ization. Unlike init, command.com does not keep a list of programs to 
monitor or maintain multiple levels that allow single-user and multi- 
user operation. 


Starting the Command Line Interpreter 


In Chapter 3, I examined the four stages of a DOS-C boot. In the final 
stage, the DOS-C kemel loads command. com, or any other program speci- 
fied in the config. sys file, and is the first program to run with the DOS-C 
kernel. It establishes the environment for all subsequent programs executed 
under DOS-C. What you don’t know is the method used to start this pro- 
gram. Take a closer look at how DOS-C starts command. com. 

Loading command.com is a four-step process and is fairly straight- 
forward. The reason is simple: if you allow DOS-C users to replace 
command.com with another command line interpreter of their choice, 
you need to get that information from the user. There are many ways to 
do this. You may have a special init program that reads a configura- 
tion file and starts the command line interpreter. This is the UNIX 
model where init spawns a special set of programs to listen to a serial 
line or the console. These programs specifically look for a login and 
then look up the command line interpreter to run from a field contained 
in the /etc/passwd file. You may also build the command line inter- 
preter into the kemel. An operating system that implements this model 
is CP/M. However, DOS-C is based on the MS-DOS model that speci- 
fies the default command line interpreter command.com and loads it 
unless it is directed otherwise by an entry in the file config.sys. The 
DOS-C kernel uses a process broken into three phases to facilitate the 
config.sys switch (Figure 7.1). 

The DOS-C kemel uses a data structure, config, to hold information 
regarding the number of buffers, the number of files, and the name of the 
command line interpreter. This data structure is initialized with default 
values when the kermel is compiled. The reason you initialize this structure 
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with default values is simple: you need to read a file in order to determine 
the user kernel configuration. You need to allocate space for file structures 
and buffers before you can access the file system to read config.sys. 
The kernel uses this initial set to initialize some file system portions of the 
kernel in preparation of processing config.sys in the function 
init_kernel (). It calls the function PreConfig( ) to perform this initial 
allocation of file system data structures. 

With preliminary file system data structures in place, the kernel then 
processes config.sys, replacing the initialized portions of the config 


Figure 7.1 A three-phase process to load an executable file from 
config. sys. 
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data structure with those values parsed from the contents of config.sys. 
One of those structure members is cfgInit. This member contains either 
the string “command.com” as initialized at compile time or the string 
parsed from the command= line in config.sys. Also, the remainder of 
any line parsed from the command= line in config.sys is stored in the 
structure member cfgInitTail. This allows special parameters such as 
stack size or environment buffer size to be passed to the start-up command. 

When the kernel is ready to execute its first process, it invokes the 
function p_0(). This function looks up the program it will execute by 
referring to the command name stored in config.cfgInit and passes 
the command tail config.cfgInitTail. Function p_0() packs this 
into a standard DOS exec_blk data structure and calls DosExec( ), the 
internal version of int 21h Fn OOh, to start the program. By using this 
technique, any standard DOS executable program may be used as the 
start up command. By convention, you start command .com identically 
to the way MS-DOS starts its command line interpreter. Once started, 
the kernel suspends its execution and control passes to the command 
line interpreter. From this point on, the kernel remains in the back- 
ground, only to execute system calls as needed by command.com and 
other application programs. 


command.com 


In DOS-C, the command line interpreter is command .com. As mentioned 
earlier, this is by convention and can be almost any other program. For 
example, during debugging of the DOS-C command.con, the line 


command=command.exe /p 


is inserted into config.sys so that the version executed is the one that 
contains all debugging information normally stripped from the .com 
file. This allows the use of a standard DOS source-level debugger to 
simplify the debugging of the interpreter. 

The DOS-C version of command.com is written in C. No special 
coding tricks or foolery are associated with it. It is a simple program 
designed to accept, parse, and execute user input. It has a number of 
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built-in commands identical to the MS-DOS version and allows the 
grouping of commands into a single file known as a batch file. Like 
MS-DOS, DOS-C command. com batch processing allows for some sim- 
ple program control instructions that test various parameters and mod- 
ify flow of control based on the results of these tests. It also maintains 
the global environment block so that programs have access to the glo- 
bal environment variables. 


Architecture 


The architecture for command.com is simple. It is organized as a table- 
driven executive with a collection of stand-alone functions used to 
implement the internal commands. This executive performs a linear 
search through the table in order to find the function that corresponds to 
the command. If there is a match, the executive executes the corre- 
sponding function; otherwise, it defaults to the terminating entry. This 
process continues forever, repeating the prompt-command cycle for 
every user command. 

Take a closer look at how command.com works. What you see from 
the initial examination is that it is very much a standard C program. Like 
all conventional C programs, execution starts at main( ) (Listing 7.1). 


Listing 7.1 The command.com main() function. 


VOID main() 
{ 
COUNT nread; 
BOOL bool_FLAG = FALSE; 
BOOL cflag; 
BYTE FAR *cmd_tail; 
BYTE *p_ptr; 
extern UWORD _psp; 
psp FAR *p; 
COUNT driveno = -1; 
BYTE pattern[MAX_CMDLINE] = ""; 
BYTE path[MAX_CMDLINE] = "", esize[MAX_CMDLINE] = ""; 
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Listing 7.1 The command.com main() function — continued. 


/* Initialize the interpreter xi) 
p = MK_FP(_psp, 0); 

switchchar = '/'; 

batch_FLAG = FALSE; 

argv(0] = args(0]; 

argv[1] = (BYTE *)0; 

args(OJ(0O] = '\O'; 

*tail = '\0'; 

env = (BYTE FAR *)MK_FP(p -> ps_environ, 0); 
cmd_tail = MK_FP(_psp, 0x81); 

fstrncpy((BYTE FAR *)tail, cmd_tail, Ox7f); 


pflag = cflag = FALSE; 
dosopt ("$d$p*Ce:pc]+", (BYTE FAR *)tail, 
&driveno, path, pattern, esize, &pflag, &cflag); 


/* Get the passed-in Environment size and make certain we */ 
/* allocate enough space */ 
EnvSize = EnvSizeUp(); 
if (EnvSize < ENV_DEFAULT) 

EnvSize = ENV_DEFAULT; 
if(*esize != ‘'\0') 
( 

COUNT size = atoi(esize); 


bool_FLAG = EnvAlloc(size); 
EnvSize = size; 

) 

else 
bool_FLAG = EnvAlloc(EnvSize); 


if (!boo1_FLAG) 
error_message(OUT_ENV_SPACE ); 


/* Check what PROMPT is set in env to over ride default 
p_ptr = EnvLookup( "PROMPT" ); 
if(p_ptr != (BYTE *)0) 
scopy(p_ptr, prompt_string); 
else 
scopy(dflt_pr_string, prompt_string); 
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command .com starts by initializing a number of internal variables. It 
then examines the command tail, searching for options passed when 
DOS-C invoked command.com. These options modify the execution of 
the interpreter. For example, the /p option tells command .com that this 
instance is the root interpreter (since other instances of command.com 
may be invoked) and must not exit, thereby fixing this copy as the root 
process for DOS-C. command.com may also be invoked to execute a 
single command with the /c option, so it must modify its internal code 


Listing 7.1 The command.com main() function — continued. 


/* Check what PATH is set in env to over ride default */ 
p_ptr = EnvLookup("PATH"); 
if(p_ptr != (BYTE *)0) 
scopy(p_ptr, path); 
else 
scopy(dflt_path_string, path); 
if(!cflag) 
{ 
if (pflag) 
{ 
/* Special MS-DOS compatability initialization, 
/* all command shells terminate onto themselves, 
/* but we always terminate at the root shel). 
/* If anyone complains, we'll] change it. 
{Hifndef DEBUG 
p -> pS_parent = _psp; 
ftendif 


/* Try to exec autoexec.bat 
bootup = TRUE; 
*tail = '\0'; 
if(!batch(".\\autoexec.bat", tail)) 
( 

*tail = ‘\0'; 

cmd_date(1, argv); 

cmd_time(1, argv); 

bootup = FALSE; 
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Listing 7.1 The command.com main() function — continued. 


else 
{ 
/* Announce our version 
printf (ANNOUNCE, copyright); 
#Hifdef SHWR 
printf("**** Shareware version ****\nPlease register 
your copy.\n"); 
ffelse 
printf("\n\n"); 
fendif 
} 


FOREVER 
{ 
default_drive = DosGetDrive(); 
put_prompt(prompt_string); 
if((nread = DosRead(STDIN, (BYTE FAR *)cmd_line, 
MAX_CMDLINE)) < 0) 
continue; 
do_command(nread); 


BYTE FAR *p; 


default_drive = DosGetDrive(); 
for(p = cmd_tail; *p != '\r'; pt+) 
{ 
if(*p = '/' && (*(pt+1) = ‘c' [| *(p +1) = 'C')) 
break; 
J 
p += 2; 
fstrncpy((BYTE FAR *)cmd_line, p, Ox7f); 
for(nread = 0; *p != '\r'; nread++, p++); 
+tnread; 
do_command(nread); 
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path to comply with the user’s single-command execution. Other 
options modify command. com buffer space for environment and stacks. 
For these reasons, the parsing of command line variables occurs very 
early on in main(). 

With command line options parsed, main() continues to initialize 
areas that command.com variables affected, such as prompt and path. 
Once completed, it examines whether or not it is in the single-com- 
mand mode or the iterative-command interpreter mode. If it is in the 
single-command mode, the remainder of the command line is copied 
into a local buffer and main() invokes do_command() to execute the 
single command. If it is not executing a single command, command .com 
assumes it is in the interpreter mode. 

The code interpreter mode examines the flags set earlier in main(). 
If command.com encountered the /p option, it knows that it was 
invoked as the root interpreter by the kemel. Within DOS-C, only the 
root interpreter receives the /p option, so command.com attempts to 
execute autoexec.bat. Executing this command on start-up is a DOS 
de facto standard to which all DOS users are well accustomed. Should 
autoexec.bat not be on the boot disk, command. com gives the user the 
chance to update the operating system’s date and time. This 1s also a 
DOS standard and dates back to IBM PC and XT days when there was 
no system clock, and the start-up of DOS was the only opportunity to 
set the system time. If command. com did not receive the /p option, the 
interpreter merely announces its invocation in an attempt to differenti- 
ate itself from the root interpreter. 

Once command.com completes all this preliminary set-up, it falls into 
the command loop, an infinite loop that outputs the prompt and then 
reads from standard input. Once the user enters a command, the buffer 
that received the keyboard input is passed to do_command( ), which exe- 
cutes the command. Because this loop never exits, the sequence repeats 
for as long as the interpreter is active. However, command .com can exit, 
but I will examine this mechanism later. 
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Both code paths, the single-command path and iterative-command loop, 
share the function do_command( ) (Listing 7.2). This is the function that 
actually carries out the execution of individual commands. do_command( ) 
is responsible for input and output redirection. It also parses the command 
tail into an argument list for use by internal commands. 


Listing 7.2 The do_command() function. 


VOID do_command(nread) 
COUNT nread: 
( 
REG struct table *p; 
REG BYTE *lp; 
COUNT index = 0; 
BYTE Input(MAX_CMDLINE], Output(MAX_CMDLINE J; 
BOOL AppendMode; 
COUNT OldStdin = -1, OldStdout = -1, ErrorCode; 
BOOL I0Redirected = FALSE; 


if(nread <= 0) 
return; 
cmd_line(nread] = ‘\0'; 


/* Pre-scan the command line and look for any re-directs */ 
*Input = *Qutput = ‘\0'; 
AppendMode = FALSE; 
Redirect(cmd_line, Input, Output, &AppendMode) ; 
1O0Redirected = (*Input != ‘\O’ || *Output != ‘\0'); 
if(*Input != ‘\0') 
| 
COUNT Handle; 


if(!DosDupHandle(STDIN, (COUNT FAR *)&01dStdin, 
(COUNT FAR *)&ErrorCode) ) 
{ 
RestorelO(OldStdin, -1); 
error_message( INTERNAL_ERR) ; 
return; 
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Both the single-command mode and the iterative-command inter- 
preter share a single algorithm. This algorithm is encompassed by 
do_command( ). When initially invoked, do_command() prescans the 
command line for redirection. This is done through the invocation of 
Redirect(), which scans the command line looking for all entries that 


Listing 7.2 The do_command() function — continued. 


Handle = DosOpen( (BYTE FAR *)Input., O_RDWR); 
if((Handle < 0) |[ (!DosForceDupHandle(Handle, STDIN, 
(COUNT FAR *)&ErrorCode) )) 
{ 
RestorelO(OldStdin, -1); 
error_message( INTERNAL_ERR); 
return; 
J 
DosClose(Handle); 
J 


if(*Output i= '\0') 
( 
COUNT Handle; 


if(!DosDupHandle(STDOUT, (COUNT FAR *)&0ldStdout, 
(COUNT FAR *)&ErrorCode) ) 
I 
Restorel0(-1, OldStdout); 
error_message( INTERNAL_ERR); 
return; 
) 


if (AppendMode) 
| 
if((Handle = DosOpen( (BYTE FAR *)Output, O_ROWR)) < 0) 
( 
Restorel0(-1, OldStdout); 
error_message( INTERNAL_ERR); 
return; 
| 
DosSeek(Handte, 2, 01); 
) 
else 
Handle = DosCreat((BYTE FAR *)Output, 
D_NORMAL | D_ARCHIVE); 
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are preceded by “<” or “>”, which are conventionally used for I/O redi- 
rection. Once this scan is complete, do_command( ) proceeds to change 
the standard input and output file descriptors with the DOSAPI calls 
DosDupHandle(), DosOpen(), and DosForceDupHand1le( ). This has 
the net effect of saving the original file descriptor, so you can recover it 
later, and redirecting the input or output stream. This is done once for 
the input stream and again for the output stream. 

With I/O redirection out of the way, do_command() moves on to 
parsing the command line for possible use by internal commands. The 
internal commands use an internal representation of argc and argv 
similar to that in any conventional C program. Once the argument list is 
built, do_command( ) moves onto execute the command. It does this by 
first looking for the special case of a simple drive change. This is a two- 
character string pointed to by argv[0], where the second character is 


Listing 7.2 The do_command() function — continued. 


if((Handle < 0) || (!DosForceDupHandle(Handle, STDOUT, 
(COUNT FAR *)&ErrorCode)) ) 
( 
Restorel0(-1, OldStdout); 
error_message( INTERNAL_ERR); 
return; 
J 
DosClose(Handle); 
} 


for(argc = 0; argc < 16; argc++) 

{ 
argvCargc) = (BYTE *)0; 
argsCargc][0J = ‘\0'; 

) 

lp = scanspl(cmd_line, args{0], '/'); 


if(args(0]([0] = ‘@') 
{ 


at_FLAG = TRUE; 
index+; 

J 

else 
at_FLAG 
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Listing 7.2 The do_command() function — continued. 


/* If preceeded by a @, swallow it, it was taken care of 
/* elsewhere. Also, Change case so that our command verb 
/* is case sensitive. 

while(args(OJ[Lindex] != '\0') 

{ 


if(at_FLAG) 
args(OJ][index-1] = tolower(args(O][index]); 
else 
args[O]Lindex] = tolower(args(OJ]Cindex])); 
index++; 
J 
if(at_FLAG) 
args(O][index-1] = ‘\0O'; 


argvL0) = args(0]; 
/* this kludge is for an MS-DOS wart emulation 
tail = skipwh(1p); 


for(argc = 1; argc < NPARAMS; argc++) 
( 
lp = scan(1p, args{argc)); 
if(*argsLargc] = ‘'\0') 
break; 
else 
argvCLargc) = args[argc]; 
J 


if(*argv[O) != ‘\0') 
( 
/* Look for just a drive change command, and execute 
/* it if found. 
if(argv(OJ[1) = ':' && argvl0J({2) = NULL) 
( 


BYTE c = argv(O][0]; 


if(c >= ‘a’ && c <= '2') 
c=c-'a' +A’; 

if(c >= 'A' && c <= '2') 
DosSetDrive(c - ‘A'); 
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“’. Should this be the case, the current drive is changed through a 
DOSAPI call to DosSetDrive(), and do_command( ) exits. If it is not a 
drive change, do_command( ) checks for a special request to the Help 
subsystem. If do_command() identifies it as being such a request, it 
builds a Help command line, invokes help.exe and exits. 

Finally, if the command is neither a drive change nor a call to the 
Help subsystem, do_command( ) attempts to invoke either an internal or 
external command to satisfy the user request. It does this through a call 
to lookup( ) (Listing 7.3), which performs the actual command lookup. 
The internal command’s corresponding function is then executed by 
invoking the returned pointer to the internal function. Once the com- 
mand completes, do_command() restores the I/O to its original state 
and exits. 


Listing 7.2 The do_command() function — continued. 


/* Tt may be a help command request. */ 
else if((argv(1](0)] = switchchar) && (argv(1){1] = ‘'?')) 
{ 

strcpy(tail. “ "); 

strcat(tail, argv(0]); 

strcat(tail, "\r\n"); 

argc =2; 

argv(1] = argv(O]: 

argv(0] "help"; 

argv([2] = 0; 

ExecCmd(argc, argv): 

if (10Redirected) 

RestorelO(OldStdin, OldStdout); 

] 
/* do a normal command execution 
else 
( 

p = lookup(commands, argv[0]): 

(*(p -> func))¢argc, argv); 

if( 10Redirected) 

RestorelO(OldStdin, OldStdout); 
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When the default entry is invoked, it attempts to take the command 
parsed from the command line and locate an external program to fulfill 
the user’s request. It follows a predetermined sequence of searches 
based on filename extensions. This is conventionally *.com, *.exe, 
and *.bat, in that order. It spawns the external program, if found, and 
the external command returns to this point when it terminates. 


DOSLIB 


Before I proceed much further, I need to explain what the DOSAPI that 
I mentioned previously is all about. What you have seen is a sequence 
of calls such as DosOpen(), DosSetDrive( ), etc. MS-DOS shows the 
influences of early operating system designs in the assembly language 
system call, wherein MS-DOS, and in turn DOS-C, implements system 
calls as software interrupts, and system call parameters are passed in 
the processor register set. The choice of registers seems arbitrary and 
does not follow any high-level language conventions. Consequently, 
special arrangements must be made for high-level languages such as C, 
in effect hiding the assembly language details from the C program. 


Listing 7.3. The lookup() function. 


struct table *lookup(p, token) 
struct table *p: 
BYTE *token; 
{ 
while(*(p -> entry) != ‘\0") 
{ 
if(strcmp(p -> entry, token) = 0) 
break; 
else 
+p; 
| 
return p; 
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There are many ways to hide the system call details from the C code. 
It is possible to embed the system calls directly in a C function through 
the use of parameterized C macros. This technique is used by operating 
systems such as Linux. It is also possible to implement a message-pass- 
ing scheme where the system call parameters are placed into a message 
packet and sent to the kernel. This is the Minix technique. What I have 
done in DOS-C is define a logical set of C functions that act as the 
DOSAPI. These functions are then specially constructed to accept nor- 
mal C parameters and convert them to the int 21h system calls 
accepted by DOS-C. In doing this, I have defined the DOS-C DOSAPI. 

Examine the DosOpen( ) API call (Listing 7.4) to study the anatomy 
of an API call. The call starts out as a normal C function defining two 
parameters, FileName, normally passed as a far pointer in ds:dx, and 
FileAccess, normally in ah. The function returns a 16-bit integer that 
represents either the handle of the opened file or an error code if negative. 

On starting the function DosOpen( ), the parameter F i] eName is bro- 
ken into segment and offset components suitable for the int 21h call. 
Although you could have deferred this until later, after you make the 
transition to assembly language, the FP_SEG() and FP_OFF() macros 
are used instead. This way, you don’t need to second guess how the 
compiler encodes far pointers. DosOpen( ) then shifts gears and drops 
into assembly language using the C directive asm. Once in assembly 
language, the parameters are recovered from the function invocation 
and moved into the correct registers. DosOpen( ) then loads the function 
number into the al register and proceeds to make an int 21h call. 
When the int 21h call returns, the return values are checked for errors 
and the return value is set accordingly. 

At this point you may be wondering why the API calls are coded in 
this fashion. After all, you could code the API function entirely in 
assembly language or use the more common int21() or intdos() 
functions. Any of these choices would work, but the tradeoffs are 
apparent. If you code the API in assembly language, you must have 
intimate knowledge of how the compiler passes parameters in function 
calls, and the use of int21() and related calls runs the risk of using 
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self-modif ying code or code built on the stack. Although you don’t nec- 
essarily want to put this code in read-only memory, the code should not 
preclude this option. The hybrid approach chosen here represents the 
best compromise between the two and is an approach typically seen in 
many operating system APIs. 


Listing 7.4 The DOSAPI function DosOpen(). 


COUNT DosOpen(FileName, FileAccess) 

COUNT FileAccess; 

BYTE FAR *FileName; 

{ 
UWORD FileName_seg = FP_SEG(FileName) ; 
UWORD FileName_off = FP_OFF(FileName) ; 
COUNT Handle; 


asm |[ 

push ds 

push cx 

mov cx, FileName_seg 

mov dx, FileName_off 

mov al ,byte ptr FileAccess 
mov ds,cx 


mov ah, Ox3d 
int 0x21 


pop Cx 
pop ds 


mov Handle,ax 


out 
ax 
Handle,ax 


} 
out: return Handle; 
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Internal Commands 


I will now look at two internal commands, cd and exit. These two 
functions are representative of internal commands in command.com. In 
the case of cd, the design is virtually identical to all the other internal 
commands. exit, however, is a special case that is used to quit the 
interpreter. Because of this, exit needs to do some special housekeep- 
ing before exiting. Start by examining cd. 

The function cd() (Listing 7.5) starts out by initializing the auto- 
matic variable 0) dDrive. You do this to store the drive that is logged 
in when the flow of execution entered the cd command, so you can 
return to the original logged-in drive in the event that you encounter an 
error anywhere within the function. Next, you do a command line san- 
ity check as a simple method of checking the syntax. This sanity check 


Listing 7.5 The cd() function. 


BOOL cd(argc, argv) 

WORD argc; 

BYTE *argv(]; 

{ 
COUNT OldOrive, NewOrive = 
BYTE CurDir(MAX_CMDLINE] = 


/* Initialize where we are 
OldDrive = DosGetDrive(); 


/* Do command line sanity checks 
if(argc > 2) 
I 
error_message( ]NV_NUM_PARAMS ) ; 
return FALSE; 
) 


/* get command line options and switch to the requested drive */ 
dosopt("$d*", (BYTE FAR *)tail. &NewDrive, CurDir); 
if (NewDrive < Q) 
NewDrive = default_drive; 
DosSetDrive(NewDrive) ; 
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is to make certain that the user only enters commands of the form cd 
orcd d: (or other drive designation). Should a syntax error occur, an 
error is returned to the main command loop. 

With the syntax check out of the way, it’s time to parse the command 
line for options. In the case of cd(), you are looking for an optional 
directory parameter, which may contain an optional drive specification 
that you need to separate. cd( ) uses dosopt( ) to search for an optional 
drive, and if it encounters one, it calls DosSetDrive() to temporarily 
switch to that drive. If no directory is present in the second argument, 
cd simply echoes the current parameter. If one is present, cd() pro- 
ceeds to change the directory, and then switches back to the original 
drive, completing the cd( ) operation. 


Listing 7.5 The cd() function — continued. 


/* Do pwd function for no parameter case 
if(*CurDir = '\0') 
{ 
DosPwd(NewDrive + 1, CurDir); 
printf("Sc:\\%s\n\n", ‘A' + NewDrive, CurDir); 
DosSetDrive(OldDrive); 
return TRUE; 
] 


/* Otherwise, change the directory, and then switch back 
/* to the old directory. 
if((DosCd( (BYTE FAR *)CurDir)) != SUCCESS) 
( 
error_message(INV_DIR); 
DosSetDrive(OldDrive); 
return FALSE; 
] 
else 
( 
DosSetDrive(OldDrive); 
return TRUE; 
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cd illustrates the general design of every internal command. First, it 
performs a syntax check. Next, it proceeds to extract the information it 
needs by parsing the arguments that were passed on invocation. Finally, 
the command is implemented using DOSAPI calls. Error checks are 
performed at each step and a means of recovery is designed into each 
step. When the command exits because of an error or another reason, it 
retums an exit code that indicates either success (TRUE) or failure 
(FALSE). In the event of a failure, the function calls an error routine to 
deliver an error message. All commands within command.com follow 
this design. 

cmd_exit() (Listing 7.6) handles the built-in exit command. The 
design of this function deviates significantly from other built-in com- 
mands because of its special nature. Within DOS, the exit command is 
reserved to terminate any invocation of the command interpreter except 
for the initial one started when DOS starts up. DOS-C follows this 
same convention and supplies an exit command similar to that found 
in any other DOS, including MS-DOS. The one deviation that DOS-C 
makes is that the kernel invokes the initial command.com with a /p 
switch to indicate that it is the permanent shell. 

When you examine cmd_exit( ), you find that it begins with code that 
examines whether or not it is contained within a permanent shell. This 
early check aborts cmd_exit() before proceeding into the body of the 
function, guaranteeing that this invocation of command .com doesn’t exit. 

Another feature of command. com is that it can return values that con- 
vey information, such as error codes back to the program that invoked 
the command. This feature is typically seen in batch files, allowing the 
batch file to return a code indicating either an error condition caused 
the termination of the batch program or successful completion. A retum 
code of zero means, by convention, success. command.com also returns 
success unless otherwise instructed by an argument to the exit com- 
mand. cmd_exit() does this by a simple examination of argc. If there 
is only a single argument, it exits with a success code; otherwise, it con- 
verts the first argument to a numeric value and retums that number as 
the return code. 
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External Commands 


ExecCmd() is the central point where command.com invokes external 
commands. This is an important function since the vast majority of 
DOS commands and applications are invoked through this interface. 


Listing 7.6 The cmd_exit() function. 


BOOL cmd_exit(argc, argv) 
COUNT argc: 
BYTE FAR *argvL]; 
( 
Hi fndef DEBUG 
/* Don’t exit from a permanent shell 
if(pflag) 
return TRUE; 
endif 


/* If no values passed, return errorvalue = 0 
if(argc = 1) 
DosExit(0); 


/* otherwise return what the user asked for 
else 
{ 

COUNT ret_val; 

static BYTE nums[] = "0123456789"; 

BYTE FAR *p; 


for(ret_val = 0, p = argv(1]: isdigit(*p): 
{ 
COUNT j; 


for(j =O; j< 
if (nums[ jj 
break; 
ret_val += j; 
J 
DosExit(ret_val); 
} 
return TRUE; 
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Listing 7.7 illustrates the function ExecCmd( ), which seems excessive 
at first glance. You shouldn’t jump to conclusions too early in the game, 
however, because starting a program and accurately keeping track of its 
environment can consume significant resources. 

Start by looking at the overall design. All commands are entered 
through the command line interface you examined earlier. ExecCmd( ) 
parses the command line and breaks it into an internal argc/argv struc- 
ture similar to the C calling convention for main( ). It then takes the first 
argument and attempts to match it against the internal dispatch table. 


Listing 7.7. The ExecCmd() function. 


BOOL ExecCmd(argc, argv) 
COUNT argc; 
BYTE *argv(); 
( 
exec_blk exb; 
COUNT err; 
BYTE tmppath(64]; 
COUNT idx; 
BOOL ext = FALSE: 
BYTE *extp; 
COUNT len; 
BYTE *1p; 
CommandTail CmdTail: 
fcb fcbl, fcb2; 
static BYTE *extns(2] = 
{ 


-com , 


" exe 
); 
static BYTE *batfile = ".bat”: 
BYTE PathString[MAX_CMDLINE]; 
BYTE Path[MAX_CMDLINE], *pPath; 


/* Build the path string and create the full string that */ 
/* includes ".\" so that the current directory is searched */ 
/* first. Note that Path is initialized outside the loop. */ 
strcpy(Path, ".\\"); 

strcpy(PathString, EnvLookup( "PATH" )); 

pPath = PathString; 
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Listing 7.7 The ExecCmd() function — continued. 


/* Build a path to the command. 

if(*pPath = ‘;") 
++pPath; 

strcpy(tmppath, Path); 

if(*tmppath != '\O' && !((tmppath[{strlen(tmppath) - 

‘\A') =0)) 

strcat(tmppath, "\\"); 

strcat(tmppath, argv(0]); 


/* batch processing 
/* search for an extension in the specification 
for(idx = len = strlen(argv(0]) ; idx > O && 
idx > (len - FEXT_SIZE - 2); --idx) 
{ 
if(argv(OJLidx] = '.') 
{ 
ext = TRUE; 
extp = &argv(O)Lidx]; 
break; 


J 


/* If no extension was found, the entire path was 
/* specified and we do not append an extension. 
if( text) 
{ 

strcat(tmppath, batfile); 

extp = batfile; 
J 


/* if it ends with a ‘.bat’ (either user supplied or 
/* previously added), try to run as a batch. 
if ((stremp(extp, batfile) — 0) && batch(tmppath, tail)) 
{ 
if(pflag && bootup) 
bootup = FALSE; 
return TRUE; 
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The algorithm used for the match stops when a terminating null string 
entry is encountered. The accompanying entry points to ExecCmd( ) 
make it the default function to be invoked when looking for internal 
commands. In this fashion, an external command is treated no differ- 
ently than an internal command. By making it the default entry, you 
have guaranteed that command.com searches the internal commands 
before loading an external command. 


Listing 7.7. The ExecCmd() function — continued. 


/* tail comes in as a String with trailing newline. */ 

/* Convert it to a return only and put it into CmdTail */ 

/* format */ 

CmdTail.ctCount = (argc > 1) ? strlen(tail) : 1; 

strcpy(CmdTail.ctBuffer, " ”); 

strcpy(&CmdTail.ctBuffer(l], (argc > 1) ? tail : ""); 

CmdTail.ctBuffer(CmdTail.ctCount] = ‘\0'; 

if(CmdTail.ctCount < LINESIZE) 
CmdTail.ctBuffer({CmdTail.ctCount] = ‘\0’'; 

rtn_errlvl = 0; 

exb.exec.env_seg = FP_SEG(env); 

exb.exec.cmd_line = (CommandTail FAR *)&CmdTail; 


#Hif PARSEFN 
if(argce > 1) 
{ 
DosParseF ilename((BYTE FAR *)argv(1], 
(fcb FAR *)&fcbl, 0); 
exb.exec.fcb_l = (fcb FAR *)&fcbl; 
} 
else 
exb.exec.fcb_l = (fcb FAR *)0; 
if(argc > 2) 
( 
exb.exec.fcb_2 = (fcb FAR * )&fcb2; 
DosParseFilename((BYTE FAR *)argv([2], 
(fcb FAR *)&fcb2, 0); 
} 
else 
exb.exec.fcb_2 = (fcb FAR *)0; 
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Listing 7.7 The ExecCmd() function — continued. 


#telse 
exb.exec.fcb_l = (fcb FAR *)0; 
exb.exec.fcb_2 = (fcb FAR *)0; 
#endif 


for(idx = 0; idx < 2; idx+) 
{ 
strcpy(tmppath, Path); 
if(*tmppath != '\0' && 
!((tmppath[strlen(tmppath) - 1] '= '\\') = 0)) 
strcat(tmppath, "\\"); 
strcat(tmppath, argv[0]); 
if(!ext) 
{ 
strcat(tmppath, extnsLidx]); 
extp = extns[idx]; 
} 
if(!(stremp(extp, extnsLidx]) = 0)) 
continue; 
if((rtn_errlv] = err = OosExec( (BYTE FAR *)tmppath, 
(exec_blk FAR *)&exb)) 
$= SUCCESS) 


switch(err) 

{ 

case DE_FILENOTFND: 
continue; 


case DE_INVLDFUNC: 
rtn_errlv] = INV_FUNCTION_PARAM; 
goto errmsg; 


case DE_PATHNOTFND: 
rtn_errlvl = PATH_NOT_FOUND: 
goto errmsg; 


case DE_TOOMANY: 
rtn_errlvl = TOO_FILES_OPEN; 
goto errmsg; 
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Listing 7.7 The ExecCmd() function — continued. 


case DE_ACCESS: 
rtn_errivl = ACCESS_DENIED; 
goto errmsg; 


case DE_NOMEM: 
rtn_errlvl = INSUFF_MEM; 
goto errmsg; 


default: 
rtn_errivl = EXEC_ERR; 
errmsg: 
error_message(rtn_errlvl); 
return FALSE; 


} 
else 


( 
rtn_errlvl = DosRtnValue() & Oxff; 


return TRUE; 
) 
J 
if(err < 0 || idx = 2) 
{ 
if(!(err = DE_FILENOTFND || idx = 2)) 
( 
error_message(EXEC_FAIL); 
return FALSE; 
} 
continue; 
J 
} 
while(*Path = '\0', pPath = scanspl(pPath, Path, ‘';'), 
*Path != '\0'); 
error_message(BAD_CMD_FILE_NAME); 
return FALSE; 
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Once ExecCmd() is invoked, it initializes and searches for a com- 
mand in a fixed manner that includes searching for .bat, .com, and 
.exe forms of the command. This search is performed along fixed direc- 
tories as specified in the path environment variable. Once ExecCmd() 
finds the command, it creates the proper environment before executing 
the external command. When the external command returms, ExecCmd( ) 
checks for errors before exiting to the interpreter’s main loop, which 
adds to the robustness of the design. 

Take a closer look at the algorithm behind ExecCmd(). ExecCmd( ) 
Starts by initializing the internal execution path variable. It builds the 
path string and creates a path that begins with “\” so that the current 
directory is searched first. Next, ExecCmd() uses aC do ... while 
construct to repeat the search along each directory contained within the 
string constructed earlier. For each iteration, ExecCmd() builds a .bat 
form of the command and searches for it by invoking the function 
batch() (Listing 7.8). If it is a batch file, ExecCmd() returns; other- 
wise, it tries to load a binary executable. 

To execute a binary (.com or .exe) file, ExecCmd( ) uses an exec 
block data structure for use inan int 21h fn 4bh. It starts by creating a 
command tail for use in the system call. It also initializes the exec block 
with the master environment segment. Next, it initializes the two fcb 
entries by parsing the command line using the Parse Filename system call 
(int 21h fn 29h). Finally, it attempts to execute a .com version then an 
. exe version of the command through the DosExec( ) API call. The order 
of .comthen . exe is controlled by order of entry in the extns table. 


Listing 7.8 The batch() function. 


BOOL batch( file) 

BYTE *file; 

( 
COUNT idx; 
COUNT file_parse(); 
BOOL parse(); 
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Listing 7.8 The batch() function — continued. 


/* check to see if currently processing a batch file 
/* if so clean up 
if (batch_FLAG) 
( 
COUNT fp; 


if((fp = DosOpen( (BYTE FAR *)file, O_RDONLY)) < 0) 
return FALSE; 

else 
DosClose(fp); 

batch_FLAG = FALSE; 

for(idx = 0; idx < NPARAMS; idx++) 
posparam{idx][0] = '\0': 

for(label_cnt = 0; label_cnt < MAX_LABELS; label_cnt++) 

( 
Jabels[label_cnt].1b_name[0] = ‘\0'; 
labels[label_cnt].1b_posit = '\0O'; 

J 

DosClose(fileptr); 

J 


/* open batch file for reading 
if((fileptr = DosOpen((BYTE FAR *)file, O_RDONLY)) < 0) 
{ 
batch_FLAG = FALSE; 
return FALSE; 
} 


/* Ok, now set mode to batch and initialize positional 

/* parameter array. 

echo_FLAG = TRUE; 

batch_FLAG = TRUE; 

shift_offset = 0; 

for(idx = 0; idx < NPARAMS; idx++) 
strcpy(posparamLidx], argsLidx)); 


/* deal with command line 
file_parse(fileptr); 
return TRUE; 
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On return from the DosExec() API call, ExecCmd() examines the 
return code for error conditions. It does this for two reasons. First, it 
needs to see if it found the command. DosExec( ) uses this return infor- 
mation to proceed to the next entry, if necessary. It does this with a sim- 
ple C continue statement, bypassing all other error tests. Second, it 
needs to translate the error codes into useful user feedback in the form 
of error messages. These messages are an absolute necessity for the 
user. Without them, command .com would be nearly useless. 


Batch Commands 


Batch commands execute in a fashion similar to external commands but 
with significant differences. Unlike a binary file, command.com exe- 
cutes these commands by sequentially reading a text file that contains a 
sequence of commands. It reads the file one line at a time, interpreting 
each line. 

Command line interpreters in many operating systems, such as UNIX 
and its derivatives, possess similar features. Unfortunately, the DOS 
batch command language it is not nearly as well-structured as these 
other interpreters. The batch language that DOS uses tends to be very 
simple with not much support for flow control or structured program- 
ming. In fact, some language features such as the ca! 1 batch command 
did not occur until late in the development of MS-DOS. Without the 
cal] batch command, you cannot nest batch commands simply, and the 
lack of this command was a problem. Many users found batch program- 
ming without the cal] command extremely restrictive, and it took quite 
a bit of demand from the user community before Microsoft responded. 

The restrictive DOS batch language is somewhat unfortunate for the 
DOS-C design. One of our design goals is that DOS-C be functionally 
equivalent to MS-DOS. Maintaining compatibility with the standard 
DOS batch language is necessary to prevent breaking applications that 
rely on it. There is quite a bit of risk associated with deviating from the 
standard because many batch files in the DOS software pool may be 
affected by the changes. For example, these batch files are used in various 
applications as control programs coordinating multiple subcommands, in 
installation procedures as installation scripts, and by many users for 
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local customization. The command .com batch language is also the script 
language used by the system start-up command autoexec.bat. As a 
result, the implementation of the command.com batch scripting language 
should match the MS-DOS version. This makes the design somewhat 
cumbersome in spots. In fact, in the process of matching the original 
MS-DOS design, you quickly understand how unstructured assembly 
language coding led to the unstructured batch language design. 

Look at the DOS-C method used to execute a batch command. Batch 
commands are initiated in command. com through the invocation of the 
DosExec() function. ExecCmd( ) attempts to execute batch commands 
before binary commands and uses a call to the function batch() to 
execute the batch command. If the command is not a batch command, 
batch( ) returns a FALSE, allowing ExecCmd( ) to continue. However, if 
it is a batch command, batch() executes the batch file by reading it 
line by line and executing each line. 

The function batch() takes only a single argument, a pointer to a 
string representing the batch command to be executed. Armed with the 
filename, batch() attempts to open the batch file, but only if it is not 
currently processing one, as indicated by the global variable 
batch_FLAG. If it is in the middle of a batch file, batch() first checks to 
see if the file can be opened. If it is not opened, it returns, thereby allow- 
ing for error processing. If the file is good, it’s opened and internal vari- 
ables are reset. At this point you may wonder why you need all these 
steps before ever getting to the meat of batch processing. The answer is 
simple, if not obscure: the code is here to properly handle the chaining 
of batch files, a command .com feature heavily relied on. It is the first of 
many special code snippets necessary for correct MS-DOS emulation. 

In any event, with the possible exception of a bad batch file specifi- 
cation, batch() proceeds to open the new batch file for input, again 
keeping track of failures to allow for error processing. It then initializes 
a number of internal variables and begins parsing the file for commands 
through a call to file_parse() (Listing 7.9), returning TRUE as an 
indicator for success on completion. 
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Listing 7.9 The file_parse() function. 


COUNT file_parse(fileptr) 
WORD fileptr; 
{ 
WORD rc_FLAG; 
COUNT nread; 
COUNT pass; 
BYTE 1 ine[MAX_CMDLINE]; 
BYTE *p; 


for(pass = 1; pass <= NUM_PASSES; pass++) 
{ 
default_drive = DosGetDrive(); 
do 
{ 
BOOL eof = FALSE, eol = FALSE; 


for(*line = 0. p = line, nread = 0; 
'eol && !eof && (mread < MAX_CMDLINE); ) 
( 
if(DosRead(fileptr, (BYTE FAR *)p. 1) != 1) 
{ 
eof = TRUE; 
break; 
} 
switch(*p) 
{ 
case CTL_2: 
case ‘\r': 
*p='\0'; 
eo] = TRUE; 
break; 


case ‘\n': 
*p = '\0'; 
continue; 


default: 
++nread; 
++); 
break; 
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Listing 7.9 The file_parse() function — continued. 


expand(cmd_line, line); 


default_drive = DosGetDrive(); 
p = skipwh(cmd_line); 


/* dummy nread to get by a blank line 
if(pass > 1 && *p = ‘\D') 
{ 

nread = 1; 

i f (echo_FLAG) 


if(eof) 

printf("\n"); 
put_prompt( prompt_string); 
printf("\n"); 


J 
if(!eof) 
continue; 
else 
break; 
} 


if(echo_FLAG && pass > 1 && *p != '@' && *p != ':' && 
nread > 0) 
( 
printf("\n"); 
put_prompt(prompt_string); 
printf("%s\n", cmd_line); 
} 


if((rc_FLAG = parse(cmd_line, strlen(cmd_line), pass)) 
!= TRUE) 
( 
printf("ERROR parsing batch file\n"); 
return (rc_FLAG); 
} 
| while (nread > 0); 
DosSeek(fileptr, 0, OL); 
} 


DosClose(fileptr); 
batch_FLAG = FALSE; 
return TRUE; 
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The file_parse( ) function is heart of batch processing. This func- 
tion implements the basic structure of the batch command language. 
file_parse() builds a symbol table to keep track of labels and is 
responsible for expanding batch variables. Take a closer look at how 
this is done. 

The basic control structure of file_parse() is a for loop that 
loops through the body of the file_parse() code. file_parse() 
loops through the batch file twice. The first time, it goes through and 
builds a label table that contains an entry for each label plus the file off- 
set for that label. This allows the flow of control functions, such as 
goto, to quickly position to the correct offset in the file. In each pass, 
file_parse() first builds a line from the file, terminating on either 
end-of-file, carriage return, or line feed. Once the line is read, substitu- 
tions are made for each command.com variable, such as %0 through 29 
and other shell variables. It does this by a call to the function 
expand( ). Once the line is expanded, it echoes the line if needed, then 
executes the command by a call to parse( ) (Listing 7.10). 

The function parse() is the final link in batch processing. parse( ) 
modifies its behavior based on the pass variable. If pass is equal to 1, a 
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label, signified as a string preceded with a “:”, is entered into the symbol 


Listing 7.10 The parse() function. 


BOOL parse(line, nread, pass) 
BYTE line[]; 
COUNT nread; 
COUNT pass; 
{ 
cursor = skipwh(line); 


if(pass = 1) 
{ 
if(*cursor = ‘':') 
return label_bat(++cursor); 
else 
return TRUE; 
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table by a call to ]abe]_bat( ). All other commands are bypassed. For 
all other pass values, the label lines are bypassed, but command lines 
are executed by calling do_command( ), which is the same function that 
parses each individual command line while in the interactive mode. 

As you can see, a batch file is handled nearly identically to individual 
commands typed in by the user. The only exception is that you keep 
track of labels. What you don’t see is how flow of control is executed. If 
you look at the dispatch table, you see that you have built-in commands 
for if and goto. In the DOS-C version of command. com, each statement 
that controls flow is a built-in command similar to the dir command. 
Look at the goto command to understand how these commands work. 

All flow of control statements are similar in design. Like 
goto_bat( ) (Listing 7.11), they initially examine the batch_FLAG and 
retum immediately if command.com is not in batch mode. This flag dis- 
ables these commands during interactive operation and is set only when 


Listing 7.10 The parse() function — continued. 


switch(*cursor) | 
case ‘@': 
at_FLAG = TRUE: 
cursor = skipwh(++cursor); 
do_command(nread); 
at_FLAG = FALSE: 
break; 
case ':' 
/* labels processed in pass 1 
break; 
default: 
do_command(nread): 
break; 
} 
return TRUE; 
} 
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command .com is in batch mode. You need to disable flow of control 
Statements during interactive operation since you cannot reposition the 
input file (the keyboard in this case). 

When a flow of control statement needs to transfer the command 
flow to a new location denoted by a label, it searches the symbol table 
for the label. When it finds the label, the function retrieves the corre- 
sponding file offset and uses it to reposition the file, effectively per- 
forming a jump to the statement following the label. For conditional 
commands, the evaluation of the condition prior to performing the 
lookup modifies the batch file execution, but the code is virtually iden- 
tical. Although not complicated, the method is very effective. 


Listing 7.11 The goto_bat() function. 


BOOL goto_bat() 

| 
COUNT label_cnt = 0; 
BYTE *lp; 


if (!batch_FLAG) 
return FALSE; 


lp = skipwn(tail); 


while(labels{label_cnt].1b_name != '\0') 
| 
if(strcmp(labels(label_cnt].]b_name, 1p) = 0) 
{ 
DosSeek(fileptr, 0, (LONG)labels({label_cnt].1]b_posit); 
return TRUE; 
| 
Jabel_cnt#; 
J 
error_message(LABEL_NOT_FOUND) ; 
return FALSE; 
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Other Options for Command Interpreters 


Other command line interpreters are available for MS-DOS, and each 
has its advantages and disadvantages. However, this chapter demon- 
strates the fundamentals of operating system command line interpret- 
ers. In general, these techniques are similar to those used in other 
command line interpreters found in other operating systems. Although 
the system calls may be different, the general principles are the same. 

The DOS-C version of command.com is not a full implementation 
for MS-DOS. Two features are noticeably missing. First, the MS-DOS 
command.com makes a special int 28h call that is used to multiplex 
print.com while idle. MS-DOS needs this because of its single thread 
of execution. It must cooperatively share its thread of control with 
print.com. Also missing are pipes. MS-DOS does not support inter- 
process communications, again because of its single thread of execu- 
tion. Pipes are implemented as temporary files in MS-DOS that are 
deleted after being read. This is a simple but effective technique that 
simulates pipes well. 

These are features that can be easily added to our command.con, if 
so desired. As with all of DOS-C, the source is yours to enhance and 
change as you please. 


Chapter 8 


DOS-C Kernel: 
Putting [t All Together 


If you have read the book to this chapter, you now have an understand- 
ing of the design behind DOS-C. You have gone over the high-level 
design and taken different tours through the kernel to examine various 
aspects of the kernel. You have studied some of the fundamental kemel 
data structures and discovered how these data structures relate to both 
DOS-C and MS-DOS. But you haven’t built DOS-C yet. 

This chapter covers the actual building of the kernel and debugging 
techniques. The concept of building a program and debugging it with 
the use of remote debuggers may be new to many readers, so I will go 
through a step-by-step build and test session of the kemel. 
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Organizing the Project 


Any software project usually has some structure associated with it. In 
the smallest of projects, all the necessary files are usually kept in a sin- 
gle directory. In larger projects, individual components are kept in sep- 
arate directories and built in one common area. DOS-C follows the 
latter model. 

At the highest level, the project is organized into five directories: 
lib, doc, hdr, dist, and src (Figure 8.1). Each of these directories 
holds a major component of DOS-C. The src directory is further 
divided into subdirectories: ip], boot, command, drivers, fs, kernel, 
misc, utils, and tmp. This organization adds structure to the project. 
Is every project organized this way? No, not necessarily, and this direc- 
tory structure may not make sense for another project. You may not 
even organize your own operating system project this way, but the 
object is to organize your files in a way that makes the overall project 
easy to maintain. 

At the top level, the dist directory is an abbreviation for distribu- 
tion and holds all files necessary for a binary distribution. It is essen- 
tially an image of a bootable floppy that you would give to someone to 


Figure 8.1 DOS-C directory structure. 
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--dist 
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--boot 
- + Command 
--drivers 
--fs 
--kernel 
--misc 
--utils 
--tmp 


+- 
+- 
+- 
+- 
+- 
+- 
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try DOS-C. In this directory are boot.bin, ip]l.sys, kernel .exe, 
command.com, other executable files, and documentation, including 
release notes. Had this been a larger project, this directory may have 
been further subdivided. For example, it may have been divided into 
one directory for each disk of a distribution. Another possible organiza- 
tion is to use a single directory but have other areas for distribution on 
different media. Whatever makes the most sense for your project should 
guide your implementation. For DOS-C, the binaries are distributed 
two ways: a compressed archive and a snuggle floppy. For this reason, 
it makes sense that all the files are collected into a single directory, and 
distribution is built from that directory. 

The doc directory is where all project documentation goes. This 
directory includes informational files, release notes, to-do lists, and any 
other project-related documentation. Had DOS-C been bigger, the 
directory may have been further divided into separate directories. 
DOS-C is an active project, and as it grows, so will the documentation. 
This directory structure may change later, but for now, a single direc- 
tory with one file per documentation item is all that is necessary. 

The next three directories are dedicated to building the executables. 
They are 1ib, hdr, and src, abbreviations for libraries, headers, and 
source, respectively. The first of these directories, 1ib, is a common 
directory that holds libraries built in the project. The next directory, hdr, 
holds all header files used in the project. The final directory, src, is 
divided into subdirectories, each dedicated to a component of the project. 

Again you may wonder about the structure. Simply, you need to 
make certain that all key components, such as headers and libraries, are 
kept in areas that are easily accessible from the lower source levels. The 
general rule for DOS-C is that any project-specific header or library is 
always two directories up from the one you are in when you build the 
file. This simple organization simplifies the developer’s task and elimi- 
nates duplication of files and just plain losing files. Again, this structure 
may not be ideal for every project, but any project that is large enough 
should have a file structure. This way, all members of a project team 
know where all the files are located. 
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When you organize your project, examine the project. Try to organize 
the directory structure in a way that makes sense. In the case of DOS-C, 
the directories are organized by functionality, but for your project you 
may want to divide files into one directory per executable. It all depends 
on what makes sense for your particular development environment. 

You will also need to consider your build tools. Some make pro- 
grams include concepts such as view pathing. This is a technique where 
multiple directories specified in a makefile variable are searched for 
dependencies, much the same way that command.com searches multiple 
directories in its path variable. In this case, you may want to create a 
phantom directory structure for work in progress and use the view path 
to pick up files from the source tree that are needed for the build but use 
local files that may be under development. 


Building the DOS-C Kernel 


The build organization of DOS-C is a separate makefile for each major 
component. Each makefile is named, with a .mak extension, after the 
component it builds. For example, the makefile for kernel .exe is 
kernel .mak (Listing 8.1) and for ip] .sys is ip] .mak. This allows for 
rapid identification and association of the component and the makefile. 
There is no overall makefile. To build the entire package, invoke the 
batch file build.bat that resides in the top directory. 


Listing 8.1 Kernel makefile kernel .mak. 


i 
i] Makefile for Borland C++ 3.1 for kernel.exe 


i 
i} $Header: 0:/dos-c/src/kernel/kernel.mav 1.0 02 Jul 1995 6:30:22 
t 


# $Log: O0:/dos-c/src/kernel/kernel.mav $ 
i 

# Rev 1.0 02 Jul 1995 8:30:22  patv 
#lnitial revision. 

il 

RELEASE = 1.00 


-AUTOOEPEND 
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Each makefile has, at a minimum, four targets. These targets are 
used by the top build file and also standardize the operations across 
each DOS-C component. These targets, al], production, populate, 
clobber, and clean, define common makefile operations that you can 
perform for each DOS-C component. The target al] builds every 


Listing 8.1 Kernel makefile kernel.mak — continued. 


i 

# Compiler and Options for Borland C++ 

dt 

CC = bec +kernel.cfg 

ASM = TASM 

LIB = TLIB 

LINK = TLINK 

LIBPATH = . 

INCLUDEPATH = ..\HOR 

CFLAGS = -v -X -J. -D__STOC_=0 -ODEBUG -OKERNEL -DI86 -OPROTO -OSHWR -OASMSUPT 
AF LAGS = /Mx/2i/OSTANDALONE=1 

LIBS =..\..\LIB\OEVICE.LIB ..\..\LIB\LIBM.LIB 


it where to copy source from 
FSSRC = \ 
..\fs\fatfs.c \ 
..\fs\fatdir.c \ 
..\fs\fattab.c \ 
..\fs\dosfns.c \ 
..\fs\fcbfns.c \ 
..\fs\error.c 
SUPTSRC = \ 
..\fs\prf.c \ 
..\fs\misc.c \ 
..\fs\dosnames.c \ 
..\fs\syspack.c 
IOSRC = \ 
.-\fs\blockio.c \ 
..\fs\chario.c 


#t what to delete when cleaning 
COPIEDSRCA = \ 
fatfs.c \ 
fatdir.c \ 
fattab.c \ 
dosfns.c \ 

fcbfns.c 
COPIEDSRCB = \ 
error.c \ 
prf.c \ 
misc.c \ 
dosnames.c \ 
syspack.c \ 
blockio.c \ 
chario.c 
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executable for that component. It may consist of one or more .bin, 
.comor .exe files. The target populate copies all source files into the 
directory. DOS-C uses this target to copy files from fs, the file system 
manager directory, into ip] and kernel before building the file. This 


Listing 8.1 Kernel makefile kernel .mak— continued. 


ft *Implicit Rules* 
-£.0D): 
$(CC) cist 


-Cpp.ubJ: 
$(CC) -c isk J 
i *List Macros* 


EXE_dependencies = \ 
kernel.obj \ 
blockio.obj \ 
cChario.obj \ 
dosfns.obj \ 
dsk.obj \ 
error.obj \ 
fatdir.obj \ 
fatfs.obj \ 
fattab.obj \ 
fcbfns.obj \ 
initoem.obj \ 
inthndlr.obj \ 
joctl.obj \ 
Mmain.obj \ 
menungr.obj \ 
misc.obj \ 
dosnames.obj \ 
prf.obj \ 
strings.obj \ 
sysclk.obj \ 
syscon.obj \ 
syspack.obj \ 
systime.obj \ 
task.obj \ 
apisupt.obj \ 
asmsupt.obj \ 
execrh.obj \ 
procsupt.obj \ 
slacks.ubj 


*Explicit Rules* 
all: production 


production: pupulgle kernel .exe 
tdstrip kernel .exe 
copy kernel.exe ..\..\disl 
del *.obj 
del kernel .exe 
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Listing 8.1 Kernel makefile kernel .mak — continued. 


populate:  $(FSSRC) $(SUPISRC) $(1OSRC) 
release $(RELEASE) 
..\utils\pop $(FSSRC) 
..\utils\pop $(SUPTSRC) 
.-\utils\pop $(10SRC) 


clobber: clean 
del kernet.exe 


clean: 
..\utils\rmfiles $(COPIEDSRCA) 
..\utils\rmfiles $(COPIEOSRCB) 
del *.obj 
del 
del 
del 
del 
del 
del 


* bak 
*cort 
*xrf 
* map 
«lst 
* las 


kernel.exe: kernel.cfg $(EXE_dependenc ies) 
S(LINK) /m/v/c/P-/L$(LIBPATH) @8& | 

kernel .objt 

blockio.obj+ 

chario.objt+ 

dosfns.obj+ 

dsk.obj+ 

error. obj+ 

faldir.objt+ 

fatfs.obj+ 

fatlab.obj+ 

fcbfns .obj+ 

initoem.obj4 

raithndlr .obj+ 

tocti.ubj! 

main.obj+ 

memngr .obj+ 

misc.obj+ 

dosriames.olyj! 

prf.objt 

strings.obj! 

sysclk.obj+ 

syscon.obj+ 

syspack.obj+ 

syst ime.obj+ 

Lask.obj+ 

apisupl.objt 

asmsupt .objt+ 

execrh.obj+ 

Procsupt.ob j+ 

stacks.obj 

kernel # exe file 

kernel # map file 

$(LIBS) 

J 
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way, only one copy of each source file exists, and software maintenance 
is simplified. Both targets clean and clobber clean the directory of 
files that are left over after a build occurs. The only difference is that 
clean leaves the product files, whereas cl obber leaves only the source 
files that belong in the directory. The target production creates the 
actual version that appears in the dist directory. This version is stripped 
of all debugging information and built without debugging options. 


Listing 8.1 Kernel makefile kernel .mak— continued. 


i! *Individual File Dependencies* 
kernel .obj: kernel.cfg kernel.asm 
S(ASM) S(AFLAGS) KERNEL .ASM, KERNEL. OBJ 
blockio.obj: kernel.cfg blockio.c 
Chario.obj: kernel.cfg chario.c 
dosfns.obj: kernel.cfg dosfns.c 
dsk.obj: kernel.cfg dsk.c 
error.obj: kernel.cfg error.c 
fatdir.obj: kernel.cfg fatdir.c 
fatfs.obj: kernel.cfg fatfs.c 
fattab.obj: kernel.cfg fattab.c 
fcbfns.obj: kernel.cfg fcbfns.c 
initoem.obj: kernel.cfg initoem.c 
inthndir.obj: kernel.cfg inthndir.c 
jioctl.obj: kernel.cfg ioctl.c 
Main.obj: kernel.cfg main.c 
memmgr.obj: kernel.cfg meumgr.c 
misc.obj: kernel.cfg misc.c 
dosnames.obj: kernel.cfg dosnames.c 
prf.obj: kernel.cfg prf.c 
strings.obj: kernel .cfg strings.c 


sysclk.obj: kernel.cfg sysclk.c 
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Now build DOS-C. First, change directory to the DOS-C root direc- 
tory. Next, enter the command bui1d and the process begins. The batch 
file changes directory to each component directory and executes make 
with the target production specified. This usually takes a few minutes. 
When it completes, the dist directory contains all the executable com- 
ponents of DOS-C. This is a simple procedure to build an entire operat- 
ing system. 


Listing 8.1 Kernel makefile kernel .mak — continued. 


syscon.obj: kernel.cfg syscon.c 
syspack.obj: kernel.cfg syspack.c 
systime.obj: kernel.cfg systime.c 
task.obj: kernel.cfg task.c 


apisupt.obj: kernel.cfg apisupt.asm 
$(ASM) S(AFLAGS) APISUPT.ASM,APISUPT.OBJ 


asmsupt.obj: kernel.cfg asmsupt.asm 
$(ASM) SCAFLAGS) ASMSUPT.ASM,ASMSUPT .OBJ 


execrh.obj: kernel.cfg execrh.asm 
$(ASM) SCAFLAGS) EXECRH.ASM, EXECRH.OBJ 


procsupt.obj: kernel.cfg procsupt.asm 
$(ASM) S(AFLAGS) PROCSUPT .ASM,PROCSUPT .OBJ 


stacks.obj: kernel.cfg stacks.asm 
$(ASM) S(AFLAGS) STACKS.ASM, STACKS. OBJ 


proto.h:  $(CSRC) $(LIBCSRCI) $(LIBCSRC2) $(LIBCSRC3) $(LIBCSRC4) 
echo /* proto.h generated by make */ > proto.h 
echo #undef _P >> proto.h 
.-\utils\proto $(CSRC) 
..-\utils\proto $(LIBCSRC1) 
.-\utils\proto $(LIBCSRC2) 
..\utils\proto $(LIBCSRC3) 
..\utils\proto $(LIBCSRC4) 
echo Don’t forget to edit proto.h for version control 


i *Compiler Configuration File* 
kernel .cfg: kernel .mak 
copy &&| 
$(CFLAGS) 
| kernel .cfg 
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Now, look at another potential activity. Add a feature to the DOS-C 
kernel. Add function 80h, which outputs a message when invoked. You 
will need to modify the file inthnd]r.c to add the new system call and 
atest program, test80.c (Listing 8.2), to test the new feature. 

The new function call is simple. Use the built-in printf () function to 
output the test message and the contents of the a] register. Then negate the 
al register and returm. It easily could have been a more complex call, sim- 
ilar to some of the other DOS-C system calls, but this serves the purpose. 

Edit inthnd|r.c (Listing 6.4) and add the code from Listing 8.3 to 
the int2l_service() switch. Build the new kermel in the kernel 
directory with the command make -f kernel kernel .exe to build a 
debug version of the kernel. The make builds the kernel and when it’s 
complete, you proceed to make a test disk. 


Listing 8.2 Test program test80.c. 


#Hinclude <stdio.h> 
fHinclude <dos.h> 


iidefine TESTFN 0x80 
void main(void) 


\ 
union REGS regs; 


printf("Start test --\n"); 
regs.h.ah = TESTFN; /* our new system call 
regs.h.al = Oxl2; 


intB6(Ox21, &regs, &regs); 
printf("Done.\n Function returned Ox%02x\n", regs.h.al); 


Listing 8.3 Newsystem call 80h. 


case 0x80: 
printf("System Call received Ox%02x\n", r -> AL); 
r -> AL=-(r -> AL); 
break; 
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First, switch to the dist directory and use the sys command to cre- 
ate a bootable floppy disk. Next, switch back to the kernel directory 
and copy kernel .exe onto the disk, overwriting the distribution ker- 
nel. Finally, copy test80.exe to the disk. It is that simple to create a 
test disk. In fact, once you have a bootable floppy disk, all you need to 
do is copy kernel .exe to it because the loader, ip] .sys, has the capa- 
bility to load EXE files. Reboot the system and run test80. It returns 
the message “System Call received Oxee”’. 


Testing the Kernel 
(or Did I Really Want to Do This?) 


Now comes the fun part — testing your new kernel. Many times, you will 
make a change and the kernel will simply not work. The question is: what 
do you do then? Take a look at debugging techniques that may be used to 
debug any stand-alone operating system or embedded application 

You have a number of debugging options. The first is to place spe- 
cial messages throughout code. This technique proves quite useful and 
is quick, low cost, and effective when the program that you are debug- 
ging is running, just incorrectly. 

With DOS-C, this option is built into the kernel at a few key places. 
Also, if you look at int 21h function 33h subfunction Of dh and Of fh, 
you will see two debug toggles used for read and write system calls and 
system calls in general. These prove quite helpful in determining what a 
user program is doing. 

However, this type of debugging generally cannot handle tougher 
problems. For example, wild pointers and uninitialized variables 
require guess work to specifically dump the affected area. Often, this 
type of problem doesn’t manifest itself until a particular application or 
sequence of events occur. That is when guess work and iterative tests 
become time consuming. 
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However, there are other software alternatives. The first is simula- 
tion. Using this technique, you build a special version of the program 
and run it under a well-behaved operating system. DOS-C has a special 
version called kdb. kdb is a small command line interpreter that simu- 
lates system calls into the kere] and allows the user to set and examine 
vaniables. It is particularly useful during the initial debugging stages 
when algorithms are being tested and if a good source-level debugger is 
available for the particular C compiler you are using. 

Another especially useful technique is remote debugging. With this 
technique, you first load a small debugger onto the target system or in 
ROM. Then, using the host, load the program to be debugged onto the 
target system. The host system then allows you to examine levels, sin- 
gle step, set break points, and perform other general source-level 
debugging. Much of DOS-C application testing was done using a com- 
mercially available remote debugger. The package used during DOS-C 
development required building a special target program that included 
some special startup code. To handle this, a special makefile was cre- 
ated that built the special debug target. However, these systems cost 
many hundreds of dollars. 

The disadvantage of this technique is that when your target system 
crashes, you may have no way of performing a post mortem, especially 
if the code you are testing is in RAM. A wild pointer or an infinite loop 
can quickly overwrite the program or the data area with usually no way 
of stopping or breaking the program. 

The optimum technique is to use an in-circuit emulator, which has 
all the advantages of remote debugging, but with additional hardware 
support. With an in-circuit emulator, you remove the processor from the 
target under test and replace it with an umbilical to special hardware 
connected to the host. This hardware simulates the processor and adds 
special hardware break points not normally available to remote soft- 
ware debuggers. For example, if you have a particular variable that 
mysteriously changes during debugging, you can set a hardware break 
point on a wnite to that one location or range of locations. When a write 
occurs, the in-circuit emulator will stop and show you exactly which 
instruction in your code is doing the errant write. 


DOS-C Kernel: Putting It All Together — 287 


In-circuit emulators also have other features, such as trace stacks, 
which store some past sequence of instructions. You can tum on the 
trace buffer and stop, then look at all the history within the buffer. This 
is especially useful when examining loop terminations and hardware 
interrupt timing. Some models even have built-in profiling logic that 
will do a profile of your code nonintrusively. 

So, what’s the catch? Cost. In-circuit emulators typically cost thou- 
sands of dollars and some are literally full-blown workstations that run 
into the tens of thousands. The price for debugging ease may be expensive. 

It is all a matter of your application and budget. If you are writing 
the operating system for a multimillion dollar realtime application, 
multiple in-circuit emulators are probably appropriate. If you are doing 
a midnight engineering job on a budget, remote debugging or simple 
printf() statements will do fine. It is up to you to determine your 
debugging technique. 


Where Do You Go From Here? 


It is all up to you. I have presented you with a look into an operating sys- 
tem that performs like a popular well-known operating system, MS-DOS. 
What you do with it is your choice. 

You can use it to customize your environment and build special fea- 
tures into it to support your application. You can also make some minor 
changes and embed it into an application as the embedded operating 
system, or you can simply change main() and fall directly into your 
application. Or you can use it as an example to better understand the 
principles of an operating system. 

The choice is yours. You now have a good understanding of DOS-C 
and the source code is for you to use within the DOS-C license terms. 
Use it as a tool and good luck. 


Appendix A 


A Note about 
Portability 


Portability is an issue from which hardly anyone can escape. Develop- 
ers encounter pressures daily, from management and customers alike, 
to develop portable code. In a typical project, it may be desirable to use 
the same source code for different platforms in order to create a wider 
market for the end product. Also, with technology rapidly changing, the 
code developed today will probably be reused on the next generation of 
hardware, necessitating portable design techniques. Additionally, shar- 
ing the code base among different processors greatly eases software 
maintenance problems. 
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The C programming language is a major contributor to portability. It 
is ubiquitous and standard. By using C, today’s developer accomplishes 
a significant degree of portability but is also lulled into a false sense of 
security by the language’s features. C library functions may behave 
differently with different operating systems. There are also semantic 
problems with the language itself. For example, integer size is variable, 
depending on the target machine. With segmented architecture proces- 
sors, another variable is pointer size and memory organization when 
the pointer is in memory. Yet another difference is the order of bits in 
bit fields. These, and other factors, present quite a challenge to the 
developer. 

When I originally began the design of the predecessor to DOS-C, I 
had already tackled portability problems in my designs of real-time and 
embedded systems using both 8-bit and 16-bit processors. I had also 
standardized on C in the early 1980s to help bring these products to 
market faster. This experience proved invaluable when a potential cus- 
tomer approached me to make it available on a non-Intel platform. 
Encouraged by this customer, I undertook the task of writing a portable 
operating system. 

One important part of the design was to ensure processor indepen- 
dence in the code. The first step in achieving this goal was to minimize 
the amount of assembly code. The second step was to guarantee porta- 
ble C code with stringent design rules to guarantee word size and byte 
order independence. 

I guaranteed meeting these design rules with the architecture 
described in Chapter 3 and by designing a portable set of types that I 
could change to match the capability of the target processor with the 
use of the C preprocessor ##ifdef. These portable types allow me to 
guarantee that the physical disk structures are processor neutral. The 
need for portable types becomes evident by studying the DOS file sys- 
tem. A quick examination of the MS-DOS FAT file system reveals 
three simple types: 8-bit, 16-bit, and 32-bit. I achieved part of the goal 
by simply casting the 8-bit type to char and the 32-bit type to Jong. 
However, an int is not a good choice for the 16-bit type because its 
length may be either 16-bit or 32-bit depending on the target processor. 
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Casting the 16-bit type to a short guarantees the 16-bit length. Addi- 
tionally, I needed to guarantee the representation of these variables as 
both signed and unsigned values, so creating another unsigned set 
guaranteed non-negative values for ANSI-C compilers. Unfortunately, 
one of the target machines only had K&R C available, which compli- 
cated the issue because K&R C doesn’t have some unsigned types. 

I did not want to depend on familiarity with C types in order to recog- 
nize the size of the variable, so I decided to create aliases so they stand 
out in even the most cursory browsing of the code. I chose the following: 


typedef char BYTE; 
typedef short WORD; 
typedef long DWORD ; 


typedef unsigned char _UBYTE; 
typedef unsigned short UWORD; 


typedef unsigned long ULONG; 


typedef short SHORT ; 
dHifdef STRICT 

typedef signed long LONG; 
telse 

#Hdefine LONG long 
dtendif 


Pointer size and organization are another area of concern for com- 
patibility. As you know, the 80x86 family has a real mode designed to 
be compatible throughout the entire family. This guarantees upward 
compatibility from the 8088. DOS is a real mode operating system 
originally designed for the 8088. Many of the DOS system calls require 
far pointers in the form of segment:offset. To take advantage of the 
32-bit linear protected mode, a layered product, such as a DOS 
extender or the Windows GUI, supplies protected mode enhancements. 
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This means that DOS clone designers must make use of C constructs 
such as far and memory models such as small. Because DOS-C itself 
is small enough to fit into the small model (separate code and data 
areas, each less than 64Kb), I decided to take advantage of the small 
model so that I would have tighter and faster code than if I used the 
large model (separate code and data areas, each with as much as |Mb 
of memory). However, the tradeoff for this decision is that any refer- 
ence to memory outside of DOS-C must be a far reference, and the 
segment:offset pair must be adjusted. Unfortunately, compilers for lin- 
ear machines may not support the far and near keywords. The method 
I chose to simplify this requirement is: 


iHifdef <segmented machine> 
## define FAR far 

it define NEAR near 

ftendi f 


fHifdef <linear machine> 
i} define FAR 

it define NEAR 

fendi f 


With this approach you compile all files with a command line option 
to switch between the architectures (i.e., -D 186 for 80x86, - DMC68K for 
680x0, etc.). When the preprocessor completes its pass, architecture- 
dependent features such as the far are expanded to either “far” or “” 
depending on the architecture. 

Finally, the issue of byte order differences between in-memory and 
disk images of file system variables is addressed whenever data is 
transferred between a disk data structure member and a memory struc- 
ture member. The in-memory version is always the native representa- 
tion by default. I did not want to waste time by manipulating it every 
time the program needed to access an in-memory variable. However, 
when I transfer the variable to or from a disk buffer, I invoke the appro- 
priate conversion function. This function is either a preprocessor macro 
or a C function, depending on whether or not the target is a native 
80x86 processor. When I compile a native version, a preprocessor 
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switch bypasses the conversion function for the sake of efficiency. This 
was handled by the following code: 


tHifdef NATIVE 

## define getlong(vp. 1p) (*(LONG *)(1p)=*(LONG *)(vp)) 

## define getword(vp, wp) (*(WORD *)(wp)=*(WORD *)(vp)) 

## define getbyte(vp, bp) (*(BYTE *)(bp)=*(BYTE *)(vp)) 

## define fgetlong(vp, 1p) (*(LONG FAR *)(1p)=*(LONG FAR *)(vp)) 
# define fgetword(vp, wp) (*(WORD FAR *)(wp)=*(WORD FAR *)(vp)) 
i# define fgetbyte(vp. bp) (*(BYTE FAR *)(bp)=*(BYTE FAR *)(vp)) 
## define fputlong(1p, vp) (*(LONG FAR *)(vp)=*(LONG FAR *)(1p)) 
# define fputword(wp, vp) (*(WORD FAR *)(vp)=*(WORD FAR *)(wp)) 
it define fputbyte(bp, vp) (*(BYTE FAR *)(vp)=*(BYTE FAR *)(bp)) 
#felse 

VOID getword(VOID *, WORD *); 

VOID getbyte(VOID *, BYTE *); 

VOID fgetlong(VOID FAR *, LONG FAR *); 

VOID fgetword(VOID FAR *, WORD FAR *); 
VOID fgetbyte(VOID FAR *, BYTE FAR *); 
VOID fputlong(LONG FAR *, VOID FAR *); 
VOID fputword(WORD FAR *, VOID FAR *); 
VOID fputbyte(BYTE FAR *, VOID FAR *); 
#tendif 


The conversion functions themselves are in the file syspack.c 
(Listing A.1). 
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ListingA.I syspack.c 


[RRR IKK KK KKK KKK IK KKK IKK KIKI KK IK IKK IKK IKK KKK KKK KKK KKK KAKA KK KI K / 
*/ 
SYSPACK.C */ 


System Disk Byte Order Packing Functions 


Copyright (c) 1995 
Pasquale J. Villani 
All Rights Reserved 


This file is part of DOS-C. 


DOS-C is free software; you can redistribute it and/or 

modify it under the terms of the GNU General Public License 

as published by the Free Software Foundation; either version */ 
2, or (at your option) any later version. */ 


DOS-C is distributed in the hope that it will be useful, but 
WITHOUT ANY WARRANTY; without even the implied warranty of 
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See 
the GNU General Public License for more details. 


You should have received a copy of the GNU General Public 
License along with DOS-C; see the file COPYING. If not, 
write to the Free Software Foundation, 675 Mass Ave, 
Cambridge, MA 02139, USA. 


[RRR I KK IK KK KK KK KK KK KKK KK IK IK KK IK IKI IKK I KIRK IKI KKK KKK KK AK KK AK / 


#Hinclude "../../hdr/portab.h"” 
#Hinclude "globals.h" 


/* $Logfile: D:/dos-c/src/fs/syspack.c_v §$ */ 

#Hifndef IPL 

Static BYTE *syspackRcsId = "$Header: 
D:/dos-c/src/fs/syspack.c_v 

May 1996 21:15:12 patv §$"; 

dtendi f 
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ListingA.I syspack.c—continued 


~ 
* 


$Log: D:/dos-c/src/fs/syspack.c_v $ 


Rev 1.3. 29 May 1996 21:15:12 = patv 
bug fixes for v0.9la 


Rev 1.2 O01 Sep 1995 17:48:42  patv 
First GPL release. 


Rev 1.1 30 Jul 1995 20:50:26 patv 
Eliminated version strings in ip] 


Rev 1.0 02 Jul 1995 8:05:34 _ patv 
Initial revision. 


+ + % FF F FF F F F HH He 


* 
iy 


#Hifdef NONNATIVE 
VOID 
getlong (REG VOID *vp, LONG *1p) 

( 

*1p = (((BYTE *)vp)(0) & Oxff) + 
(CC(BYTE *)vp)C1] & Oxff) « 8) + 
(CC(BYTE *)vp)(2] & Oxff) << 16) + 
(CC(BYTE *)vp)([3] & Oxff) <« 24); 

] 


VOID 
getword (REG VOID *vp, WORD *wp) 
( 
*wo = (((BYTE *)vp)(0O] & Oxff) + ((((BYTE *)vp)[1) & Oxff) <« 8); 
J 


VOID 
getbyte (VOID *vp, BYTE *bp) 
( 
*bp = *((BYTE *)vp); 
J 
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Listing A.I syspack.c— continued 


VOID 

fgetword (REG VOID FAR *vp, WORD FAR *wp) 

{ 

*wo = (((BYTE FAR *)vp)(0O) & Oxff) + 
((((BYTE FAR *)vp)[1] & Oxff) << B); 

J 


VOID 
fgetlong (REG VOID FAR *vp, LONG FAR *1p) 

( 

*Ip = (( (BYTE *)vp)(O) & Oxff) + 
(CC(BYTE *)vp)C1) & Oxff) << B) + 
C(CC(BYTE *)vp)£2) & Oxff) << 16) + 
((C(BYTE *)vp)(3) & Oxff) <«< 24); 

} 


VOID 
fgetbyte (VOID FAR *vp, BYTE FAR *bp) 
{ 
*bp = *((BYTE FAR *)vp); 
} 


VOID 
fputlong (LONG FAR *1Ip, VOID FAR *vp) 
( 


REG BYTE FAR *bp = (BYTE FAR *)vp; 


bp(0O] = *Ip & Oxff; 

bpl1)] = (*lIp » B) & Oxff; 
] = (*Ip >> 16) & Oxff; 
] = ( 


*lp >» 24) & Oxff; 


bp[2 
bp[3 
J 


VOID 

fputword (WORD FAR *wp, VOID FAR *vp) 
{ 

REG BYTE FAR *bp = (BYTE FAR *)vp; 


bp[0) = *wp & Oxff; 
bp[1] = (*wp » B) & Oxff; 
) 
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Listing A.1  syspack.c—continued 


VOID 

fputbyte (BYTE FAR *bp, VOID FAR *vp) 
{ 

*(BYTE FAR *)vp = *bp; 


| 
endif 


VOID 
getdirent (BYTE FAR *vp, struct dirent FAR *dp) 

{ 

fbcopy(&vp(DIR_NAME], dp -> dir_name, FNAME_SIZE); 
fbcopy(&vp(DIR_EXT], dp -> dir_ext, FEXT_SIZE); 
fgetbyte(&vp[DIR_ATTRIB), (BYTE FAR *)&dp -> dir_attrib); 
fgetword(&vp[DIR_TIME], (WORD FAR *)&dp -> dir_time); 
fgetword(&vp[DIR_DATE], (WORD FAR *)&dp -> dir_date):; 
fgetword(&vp[DIR_START], (WORD FAR *)&dp -> dir_start); 
fgetlong(&vp(DIR_SIZE], (LONG FAR *)&dp -> dir_size); 

} 


VOID 

putdirent (struct dirent FAR *dp, BYTE FAR *vp) 
{ 

REG COUNT i; 

REG BYTE FAR *p; 


fbcopy(dp -> dir_name, &vp[DIR_NAME], FNAME_SIZE); 
fbcopy(dp -> dir_ext, &vp[DIR_EXT], FEXT_SIZE); 
fputbyte((BYTE FAR *)&dp -> dir_attrib, &vp(DIR_ATTRIB]); 
fputword( (WORD FAR *)&dp -> dir_time, &vp[DIR_TIME])); 
fputword((WORD FAR *)&dp -> dir_date, &vp{[DIR_DATE])); 
fputword( (WORD FAR *)&dp -> dir_start, &vp(DIR_START)): 
fputlong((LONG FAR *)&dp -> dir_size, &vp[DIR_SIZE]); 
for(i = 0, p = (BYTE FAR *)&vp(DIR_RESERVED]; i < 10; i++) 
*p++ = NULL; 
} 





Appendix B 


The FreeDOS Project: 
Official FAO File 


This document is intended to answer frequently asked questions that 
arise regarding the FreeDOS Project and its goals. Please take the time 
to read this document to determine if your questions are answered here 
before contacting project coordinators. 


Revision History 


3 August 1996, M. “Hannibal” Toal, First Draft 
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General Information 


What Is the Goal of the FreeDOS Project? 


The goal of the FreeDOS Project is to create a completely free MS-DOS- 
compatible operating system. FreeDOS will run on all DOS-capable plat- 
forms, from XTs to Pentium Pros. 


What Is the Reason for FreeDOS? 


There are many users who either do not have access to hardware that is 
capable of running today’s 32-bit operating systems (such as Windows 95, 
OS/2, and UNIX) or who do not require the complexity of those systems. 

FreeDOS will be beneficial for businesses, schools, and organiza- 
tions because there are no royalty payments. FreeDOS will be an option 
for hobbyists, hackers, and anyone who would enjoy a chance to exam- 
ine or customize the source code and inner workings of a real operating 
system for educational, practical, or recreational purposes. 


What Will FreeDOS Not Be? 


FreeDOS will not be multitasking. It will not be object oriented. It will 
not include a Graphical User Interface, a flat memory model, or operate 
in 32-bit protected mode. 

It’s not that we do not desire any of these things. It is fun and inter- 
esting to speculate about the capabilities of an advanced, 32-bit version 
of FreeDOS. However, it is the goal of FreeDOS to complete a 16-bit 
MS-DOS-compatible operating system first, before proceeding in dif- 
ferent directions. 
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Who Owns FreeDOS? 


Everybody and nobody. FreeDOS and its associated programs and doc- 
umentation are protected under the terms of the GNU General Public 
License (GPL). This means that FreeDOS can be freely copied, so long 
as there is no attempt to restrict further copying. All copyrights are 
retained by the original authors. All source code will be publically 
available. See the full text of the GPL, included with FreeDOS, for 
more information. 


How Is FreeDOS Project Development Organized? 


The FreeDOS Project Committee, a group of the senior contributors 
and participants involved with FreeDOS, is a group of individuals that 
exists to make decisions and set standards for the FreeDOS Project. 

The FreeDOS project consists of a number of specific groups and 
each of these groups has a coordinator. These groups are the utility, 
shell, kernel, testing, and documentation groups. 


Technical Information 


What Exactly Is Included in FreeDOS? 


FreeDOS will be a complete, independent operating system. It includes 
a kernel, a shell, and a full complement of utilities. FreeDOS uses a 
freeware C compiler for all development work. Each is described sepa- 
rately in the following text. 
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What Is the DOS-C Kernel? 


The kernel is the heart of an operating system. It is a special program 
that contains low-level functions and routines to bridge the gap between 
hardware and software. Pat Villani’s DOS-C is the kernel that forms the 
basis for the FreeDOS operating system. DOS-C, as the name implies, 
is written entirely in C and will eventually be 100 percent MS-DOS 
compatible. 


What Is the FreeDOS Shell? 


A shell is an interactive program that is used to load and run other pro- 
grams and usually also provides a basic set of tools and functions nec- 
essary to make a computer actually useful. Users familiar with MS-DOS 
will know this program as command.com. 

The FreeDOS command. com (written by Tim Norman in cooperation 
with several other developers) provides piping, redirection, command 
line history and editing, several built-in functions, and batch file capabil- 
ity. Future versions will add enhancements such as command aliasing. 


What Are the FreeDOS Utilities? 


The utilities that accompany an operating system are designed to 
enable a user to perform productive tasks and to control and customize 
the computing environment. 

The FreeDOS utilities include the traditional assortment of programs 
that MS-DOS users will find familiar, from attrib to vol. Some will 
be slightly different, some will be enhanced, and some will be new pro- 
grams that should have been included with MS-DOS long ago. 


What Is the DDS MICRO-C Compiler? 


The MICRO-C compiler is a free C compiler writtten by Dave Dunfield 
of Dunfield Development Systems. MICRO-C is not officially a part of 
FreeDOS, as it is not published under the terms of the GNU General 
Public License. MICRO-C is a capable and well-designed free product 
that was chosen as the standard compiler for FreeDOS programs until a 
free compiler capable of producing 8086-compatible code is available. 
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Will FreeDOS Run on Some Particular Machine? 


Probably. The DOS-C kemel will run on virtually all hardware that is 
truly IBM-PC compatible. Very old hardware, or in some cases very 
new or unusual hardware, may have problems running DOS-C. For 
example, the DOS-C kernel does not currently deal properly with sec- 
ond hard drives. 


Will FreeDOS Run Some Particular Program? 


Maybe. Most of the problems that people have when running FreeDOS 
are due to either bugs or unfinished parts of the DOS-C kemel. The lat- 
est version (vO.91a) of the DOS-C kernel is able to run WordPerfect 5.1 
and DOOM. Microsoft Windows does not currently run under DOS-C. 


Will FreeDOS Be Compatible with 
Some Particular Version of MS-DOS? 


FreeDOS will be compatible with MS-DOS v3.30. The reason I chose 
version 3.30 as a target is that there are very few fundamental changes 
between MS-DOS v3.30 and subsequent versions. 


Specific Answers to Common Questions 


Why Not Use DJGPP? 


DJGPP is a fine product, but it generates code for only 386+ machines. 
Because the goal of FreeDOS is to run on all PC-compatible machines, 
DJGPP is not suitable for FreeDOS development. 
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Why Not Add 32-bit Multitasking or 
Other Advanced Features? 


Because we need to ship a 16-bit single-tasking product first. No such 
free operating system exists. If we keep adding more and more features 
to what we already have, this will indefinitely prolong and delay com- 
pletion of FreeDOS. 

The goals of the FreeDOS Project are clearly defined. If you have 
need for a free 32-bit multitasking operating system, we recommend 
Linux or FreeBSD. Don’t worry, you won’t hurt our feelings. 


Why Use MICRO-C? 
It Doesn’t Support floats or typedefs! 


This is absolutely true. Despite MICRO-C’s few, specific limitations, it 
is an excellent and free product. It is well suited to the task of construct- 
ing small command-line-oriented utilities. 

Seriously — do you need to pay hundreds of dollars for an advanced 
commercial C/C++ compiler to develop a program like xcopy? By 
choosing a free compiler to produce a free operating system, not only is 
the FreeDOS project philosophically consistent with its own goals, but 
it is easier for new people to participate in development because there is 
no specific compiler to purchase. 


But Wait! Why Is the DOS-C Kernel Written in Borland C? 


Development of the DOS-C kernel predates the birth of the FreeDOS 
Project. Because of the DOS-C kernel’s unique position as the founda- 
tion of the FreeDOS operating system and because of the time and 
effort that would be required to port it to MICRO-C or some other com- 
piler, DOS-C development will continue under Borland C. 
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1 Can’t Get Some Program to Work. 
Is FreeDOS Junk or What? 


You could throw up your hands and delete FreeDOS from your hard 
disk if you run into problems while using it. Or you could be part of the 
solution by sending us a detailed report of the nature of your particular 
problem. If we are not aware of problems, we cannot correct them. 


What Happened to FreeDOS? 
It Seemed Dead for a Long Time. 


FreeDOS suffered an extended period of inactivity due to the personal 
obligations of project coordinator M. “Hannibal” Toal. Now that this 
period has passed, activity has resumed anew. 


What Happened to James Hall, 
the Originator of FreeDOS? 


James Hall graduated from college in 1995, and due to his personal and 
career obligations, was no longer able to continue involvement with the 
FreeDOS Project. 


How to Obtain More Information 


Where Is the Official FreeDOS Web Page? 
The address of the FreeDOS web page is: 


http://sunsite.unc.edu/pub/micro/pc-stuff/freedos/freedos.nhtml 
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Where Is the Official FreeDOS FTP Site? 
The address of the FreeDOS FTP site is: 


% ftp sunsite.unc.edu 
ftp> cd pub/micro/pc-stuff/freedos 


Is There a Mailing List for FreeDOS? 
The FreeDOS mailing list is maintained by the listserver on vpro.n]. 
To subscribe to the mailing list: 


mail free-dos-request@vpro.n| 
Subject: subscribe your@email.address 


Any body text in the message is unnecessary and will not be read. 
To cancel a subscription to the mailing list: 


mail free-dos-request@vpro.nl 
Subject: unsubscribe your@email.address 


You should use the same address that you used when subscribing. 
To post a message to the mailing list: 
mail free-dos@vpro.nl 
Note that only mail to be posted to the mailing list should be sent to the 


address free-dos@vpro.nl. All new subscription and subscription 
cancellation requests should be sent to f ree-dos-request@vpro.n}. 
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Is There a Usenet Newsgroup for FreeDOS? 


FreeDOS does not have an official newsgroup. However, official 
announcements will occur from time to time on the following Usenet 
Newsgroups: comp.os .msdos. programmer and alt .msdos.programmer . 


Who Are the Contact People for FreeDOS? 


Project Coordinator M. “Hannibal” Toal 
mtoal@arctic.nsbsd.kl2.ak.us 


Kernel Development Pat Villani 
patv@iop.com 


Shell Development Tim Norman 
normat@rpi.edu 


Utility Development: Bryon Quackenbush 
odo@cris.com 


Documentation BearHeart (Bill Weinman) 
bearheart@bearnet.com 


Testing This could be you! 
your@name .here 
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What Other Documents Exist? 

Compatibility File — List of programs known to run under FreeDOS.* 
FreeDOS Manifesto A philosophical statement of Project goals. 
Status File Lists the status of all parts of the Project. 


Development Standards Guidelines for FreeDOS Project develop- 
ment.* 


* Denotes documents yet to be written. 
FreeDOS is an ongoing project and documentation will be continu- 
ally updated. Check the FreeDOS home page at: 


URL http://sunsite.unc.edu/pub/micro/pc-stuff/freedos 


for the current status of all documentation. 


How Can I Participate in FreeDOS Development? 


First, congratulations on making it this far! You should now have an 
idea about what FreeDOS is, what it is not, and where the project is 
heading. If you would like to become more involved: 


¢ Subscribe to the FreeDOS mailing list. Introduce yourself. 
¢ Download the latest FreeDOS release and try it out. 
¢ Obtain the Compatibility File and test some new unlisted programs. 


¢ Send us your results and opinions! 
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If you are a crack programmer, or even an aspiring crack programmer: 
¢ Obtain and review the Status File and Development Standards. 
¢ Download MICRO-C and try it out a little bit. 
¢ Identify a task and contact the appropriate development leader. 
¢ Hack some quality code and send it in! 


Actually, there is not that much new code to be written. A lot of what 
needs to be done is porting existing code to MICRO-C, along with 
extensive testing and error reporting to assist Pat Villani with DOS-C 
development. FreeDOS is 90 percent completed, but the last 10 percent 
always seems to take 90 percent of the time. 


Appendix C 


DOS-C 
Source Code 


The full source code for this book is now available on the pub- 
lisher’s ftp site at ftp.mfi.com/pub/rdbooks/FreeDOS.z ip; 
login as “anonymous” and download the file. Any references to 
the “companion code diskette” in the book now refer to the code 
available on the ftp site. For more information, see the FreeDOS 
home page at 

http: //sunsite.unc.edu/pub/micro/pc- stuf f/freedos 
or my personal web page at 

http: //www.monmouth.com/user_pages/patv 
for updates. 
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Before You Start 


I have made every effort to ensure that the code for this book builds and 
boots on an IBM-compatible PC. You could simply copy the subdirec- 
tory dos-c to your hard drive and build a copy of DOS-C on your com- 
puter. However, I’d recommend that you do all your initial testing either 
on a separate test computer or restrict all operations to floppy disk as 
you verify its operation. Make multiple copies of the distribution code 
on disks from which you can work. If you are working with a copy and 
should accidentally destroy it, you can simply make another copy of the 
code disk and continue working. 


Building DOS-C from the Diskette 


Copy the dos -c directory using xcopy, or a similar command, to a disk 
drive containing a minimum of 5Mb of free space. The distribution is 
not that large itself, but it requires that amount of space to compile. To 
build the operating system, a batch file (build.bat) is included in the 
dos -c directory. On invocation, build.bat proceeds to each subdirec- 
tory and builds the boot, kernel, ipl, and utilities. When it completes, 
the newly created files are in dos-c\dist. In addition, there is a corre- 
sponding batch file (clean.bat) to clean up the source directories by 
removing leftover files, such as * .obj, *. bak, etc. 
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Directory Structure 


dos-c root directory 
+----- dist holds image of distribution disk 
+----- doc documentation directory 
+----- hdr common * .h files 
+----- lib libm.1ib and device.lib 
+----- src source directories for: 
+-------- +----- boot boot.bin 
qorneest +-----command command.com and help.exe 
Hpete anid +-----drivers device.lib 
+-------- +----- fs common kernel and ip] fs manager files 
ipeceusie +-----ip] ipl.sys 
Fieeeaeee +-----kerne] kernel .exe 
Pile 4cesemi Se miscellaneous files for kernel and ip] 
t-------- +-----tmp 
qpcceneus +-----utils Sys .exe 


Organization in a Nutshell 


Each component or group of utilities is segregated into its own direc- 
tory. Whenever common files are needed, they are removed and placed 
in a separate directory. To build that component or utility, a makefile 
exists in the directory that bears the component’s or utility’s basename. 

Each makefile has at least two targets: production and clean. 
production builds the expected component or utility and clean 
cleans up the directory for distribution. The makefile may have at least 
one additional target that builds the component. I recommend that you 
examine the component’s makefile to better understand how the com- 
ponent is built. 
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Building a Bootable Floppy 


To create a bootable floppy: 


1. Format a new floppy. Do not enter a label for the disk; otherwise, the 
Sys utility will report an error and abort. 


2. Make certain you are in the directory where the DOS-C distribution 
files are stored. This is normally dos-c\dist. 


3. Enter the command: 
SYS a: 
or 
sys Db: 


to transfer the system files to the diskette. 
4. Write protect this disk and boot from it. 


Check the distribution disk for additional instructions that are unique to 
this release. 


DOS-C and FreeDOS 


Jim Hall began the FreeDOS project when Microsoft began making 
public statements that it might not support MS-DOS in the future. This 
statement of Microsoft’s marketing plans almost immediately makes 
every IBM PC, XT, AT, and clones of these machines obsolete. If your 
machine is not at least an 80386 (or 80486DX2-66, as it tums out), 
MS-DOS v6.22 is the last new release available to you. Combine this 
with the software market following Microsoft, and new releases of 
popular applications are not likely, either. This move condemns you to 
work with existing applications with the feature/functionality gap 
growing every day. Jim decided that he would drum up support for an 
independent MS-DOS operating system that would eventually pick up 
where Microsoft left off. 
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At about the same time, I was working with ]inux and stumbled 
across the dosemu project. I tried to run my DOS/NT real-time operat- 
ing system and found that I could boot it under dosemu. I donated my 
shareware version to the project in the hopes that it would help them 
while they continued to build DOS features into the emulator. I even 
considered taking the responsibility of building DOS emulation into 
dosemu since I was familiar with the interface. Jim Hall was on the 
dosemu mailing list and became excited when he heard the news. 

DOS-C came into being when I was approached by Jim Hall. Jim 
told me about his effort and I decided not to build the interface into 
dosemu, but instead to build a stand-alone kernel that could also func- 
tion as the dosemu kernel. I rewrote the original DOS-C kemel to 
remove copyrighted material that I did not own because the FreeDOS 
group wanted to be able to distribute source code under the GNU Pub- 
lic License. The result is what I have described in this book. 

However, neither DOS-C nor FreeDOS are static. There are new fea- 
tures and functions being added daily. There are new commands and 
kermel enhancements, as well as bug fixes. Compatibility problems that 
may be uncovered during kemel testing are fixed as they arise, intro- 
ducing yet another reason for continuous updating. Finally, there are 
plans to move DOS-C and FreeDOS forward with features not found in 
MS-DOS, so that future versions will have a broader appeal. 

With this in mind, I urge you to check the FreeDOS home page at 
http://sunsite.unc.edu/pub/micro/pc-stuff/freedos or my per- 
sonal web page at http: /www.monmouth.com/user_pages/patv for 
updates. 


Please Help Us 


The FreeDOS project is free software. It relies on the kindness of peo- 
ple like you to pitch in and help create new utilities, document the 
project, or simply test new software. That’s where you come in. If you 
can, please contact me at pat v@i op .comor any of the individuals listed 
in the FAQ (See Appendix B). The success of the project depends on 
your support. 
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FreeDOS Kernel explains the construction and operation of Pat Villani’s DOS-C — 
a highly portable, single threaded operating system. Written in C and with system 
calls similar to MS-DOS, the FreeDOS kernel provides an excellent source code base 
for experimentation. Study it, modify it and use it without getting lost in the 
complexity of most microkernels. The book and full source code (available on the ftp 
site) include everything you need for an 80X86 kernel and support files. 


» FreeDOS uses the de facto DOS hardware standards and provides binary 
compatibility for MS-DOS applications. 

» FreeDOS compiles with Borland C, Microsoft C and other C cross-compilers 
without using their run-time libraries. 

» FreeDOS is the kernel provided by the FreeDOS community on the Internet. 


The FreeDOS kernel provides embedded systems applications with the functionality 
of larger operating systems, including file storage, embedded databases, and 
sophisticated device control. Simplify the design of your embedded systems by 
using your PC for development and then linking your own version of FreeDOS to 
create the application ROM. 
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